This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Target Location Problem for Multi-commodity Flow

Xingwu Liu Institute of Computing Technology, Chinese Academy of Sciences. University of Chinese Academy of Sciences. Beijing, China. Email:[email protected].    Zhida Pan 111Corresponding author Institute of Computing Technology, Chinese Academy of Sciences. University of Chinese Academy of Sciences. Beijing, China. Email:[email protected].    Yuyi Wang ETH Zurich, Switzerland. Email:[email protected].
(April 15, 2020)
Abstract

Motivated by scheduling in Geo-distributed data analysis, we propose a target location problem for multi-commodity flow (LoMuF for short). Given commodities to be sent from their resources, LoMuF aims at locating their targets so that the multi-commodity flow is optimized in some sense. LoMuF is a combination of two fundamental problems, namely, the facility location problem and the network flow problem. We study the hardness and algorithmic issues of the problem in various settings. The findings lie in three aspects. First, a series of NP-hardness and APX-hardness results are obtained, uncovering the inherent difficulty in solving this problem. Second, we propose an approximation algorithm for general undirected networks and an exact algorithm for undirected trees, which naturally induce efficient approximation algorithms on directed networks. Third, we observe separations between directed networks and undirected ones, indicating that imposing direction on edges makes the problem strictly harder. These results show the richness of the problem and pave the way to further studies.

1 Introduction

Nowadays, data is generated geo-distributively at a much higher speed as compared to the existing data transfer speed; for instance, telescopes around the world bring us an unimaginable amount of astronomy data. There are two main reasons for having geo-distributed data: (1) Datacenters (DCs) are built across the globe. (2) Organizations prefer to use multiple clouds to increase reliability, security, and processing. Besides, there exist applications that process and analyze a huge amount of massively geo-distributed data to extract useful information. A typical scenario in processing geo-distributed data is that several analysis tasks are running simultaneously, and each requires a fraction of the collected data [27, 29, 39, 38, 16]. In addition, every analysis task moves needed data to a single location before the computation. Fig. 1 shows an example of geo-distributed telescope data.

The network bandwidth is a crucial factor in geo-distributed data movement and becomes the resource bottleneck. For example, the demand for bandwidth increased from 60 to 290 Tbps between the years 2011 and 2015 while the network capacity growth was not proportional. In 2015, the network capacity growth was only 40 percent, which was the lowest during the years 2011 and 2014222https://www.telegeography.com/researchservices/global-bandwidth-research-service/. When applications (such as electromagnetic radiation and infrared ray analysis), each handling data from some datacenters, have to be deployed, there is no meaningful notion of distance. The latency (travel time of a single small packet) under low-congestion conditions tends not to be noticeable to the end-users. The real difficulty here is the underlying capacity of the network. If links become congested, then the latency will increase and throughput will suffer. A key issue is how to allocate enough bandwidth to each application without causing congestion on the network [40].

Hence, we need to choose proper locations for the tasks to reduce congestion. Specifically, we propose this target location problem for multi-commodity flow: Given sources of multiple commodities on a capacitated network, the goal is to locate the targets to maximize the flow value.

We fix sources because, as our motivating example of geo-distributed data analysis shows, it is difficult, if not impossible, to change datacenters that collect data since for efficiency as these datacenters should be close to data generators. However, it is much more flexible to choose the target locations where the analysis tasks are performed.

The multi-commodity flow problem (MCF) is one of the most fundamental problems with a wide variety of scientific and engineering applications that have been studied intensively [32, 26]. In the most typical scenario, a finite number of commodities have to be sent from their sources to targets on a capacitated network. Each commodity has its own flow, and the commodities interact when their flows compete for capacity on common edges.

There are two general classes of MCF. One is network analysis which, based on a given network configuration, finds the optimal flow pattern for some objective function. The most studied objective functions include maximizing flow values and minimizing flow costs. The other belongs to network synthesis which seeks an optimal network configuration satisfying certain requirements.

In both classes, the targets of the commodities are taken for granted. To our surprise, researches have long neglected how targets are chosen. This paper is devoted to initializing such a theory.

Refer to caption
Figure 1: An overview of geo-distributed astronomy data and corresponding tasks

Our proposed problem extends the MCF framework. It does not belong to either of the two classes. It is a combination of the facility location problem and the network flow problem which are inherently related [1]. Facility location is a branch of operations research related to locating or positioning at least a new facility among several existing facilities to optimize (minimize or maximize) at least one objective function. It is among the most fundamental problems in operations research and theoretical computer science [37]. Facility location met network flow in 1990 [35] and has inspired a series of work [13, 3, 18, 2, 12]. However, all the published works minimize costs of the selected sources or targets (not the flow cost which, together with flow value, is the objective of network flow problems), and never consider the multi-commodity setting. The most crucial difference lies in that our combination is inherent, meaning that the objective is to optimize the flow value, but the literature focuses on the cost of the selected nodes rather than the cost of flow. Another benefit of our framework is that it can naturally extend almost every network flow problem, e.g., flow cost minimization.

Our model has other applications, such as Web server deployment. There are serving various demands from widely-distributed users, for example, requesting for different online video, where should the servers be located so that the users have a good experience? Again we do not care distance, and the key objective is to optimize the available bandwidth. There are more motivating examples, say, network-flow based evacuation planning for an emergency where shelters have to be selected, and congestion decides the efficiency of evacuation. Interested readers are referred to [12].

These real-world scenarios justify our problem’s critical features: The commodities are only partially determined since the targets are not given and have themselves to be optimized, and a decisive factor of the optimization is the bandwidth rather than any notion of distance. This well motivates our problem.

1.1 Results and Discussion

We propose a novel model of the target location for multi-commodity flow (LoMuF). On the one hand, we figure out the hardness results of various versions. On the other hand, we design algorithms for several versions. The results are as follows.

  1. 1.

    We show that the LoMuF problem is NP-hard on general undirected graphs.

    We know that if the targets are fixed, the problem degenerates to one normal multi-commodity flow problem (allowing fractional flows) and becomes tractable in polynomial time, which shows that the most challenging part of this problem is indeed how to locate the targets.

  2. 2.

    We design a polynomial-time algorithm solving LoMuF on trees.

    Trees are important network structures in practice. Compared to the NP-hardness result above, the fact that there is only one path connecting a source and the target on a tree simplifies the problem. Our algorithm is elegant and surprisingly shows that the interaction between different commodities becomes not harmful on trees.

  3. 3.

    We present a max{θ1,1}\max\{\theta-1,1\}-approximation algorithm for LoMuF on general undirected graphs, where θ\theta is the largest source number among all commodities.

    This result actually shows that, when θ2\theta\leq 2 (the so-called bi-source cases) the problem can also be solved efficiently, but (take into account the NP-hardness result above) becomes intractable when θ3\theta\geq 3.

  4. 4.

    For LoMuF on directed graphs (Di-LoMuF), we prove that it is also NP-hard and even cannot be efficiently approximated with a ratio less than 2.

    In fact, we also show that LoMuF on undirected graphs can be reduced to the directed case Di-LoMuF, and then Di-LoMuF should be even harder.

  5. 5.

    Di-LoMuF also remains NP-hard on symmetric di-paths and bi-source supply vectors.

    These are clear separations between undirected LoMuF and Di-LoMuF, since undirected LoMuF is efficiently solvable on trees while Di-LoMuF is even difficult on paths, a very special case of trees. As we pointed out above, the bi-source instances are easy for undirected LoMuF, but not for Di-LoMuF.

  6. 6.

    For the special case on symmetric di-trees, Di-LoMuF has a polynomial-time 2-approximation algorithm.

    Though we have seen several hardness results of Di-LoMuF, for a special but still meaningful subset, where every link has the same capability of downloading and uploading, we can obtain an efficient approximation algorithm.

  7. 7.

    We show that our results above can also be extended to other variants of LoMuF such as maximum sum flows, unsplittable flows, restricted candidate targets, maximum feasible flows, and so on. For the unsplittable version, we show that it cannot be approximated within ratio 2. For the version with restrictions on targets, it is NP-hard on uni-source supply vectors and stars and cannot be efficiently approximated within ratio 76\frac{7}{6} on trees. For the maximum feasible flows version, we prove that for any constant ϵ>0\epsilon>0, unless NP=ZPP, it cannot be approximated within O(k1ϵ)O(k^{1-\epsilon}) on kk supply vectors.

    This shows that the framework of the new location problems has a powerful capability of modeling different scenarios in practice and enriches the theory of location problems and network flow problems.

1.2 Related Work

There is an increasing vast literature on multi-commodity flow and its single-commodity special case [32]. Basically, there are two types of optimization objectives, namely, minimum cost and maximum flow which is the focus of this paper. The main theme of maximum flow in recent years is improving the efficiency of approximation algorithms [26, 14, 6, 21, 28, 23, 15, 33, 5]. The flow-cut duality is also a challenging issue and has attracted much attention from researchers [19, 31].

Facility location has flourished ever since the 1960s and remains an active topic in operations research and theoretical computer science [10, 20, 37, 20]. Though generally, no constant-ratio approximation algorithm exists, it can be constant-approximated on metric spaces. One of the main threads of research is to improve the approximation ratio in various situations [34, 22, 9].

Though inherently related to multi-commodity flow[1], facility location got to be combined with network flow only in 1990 [35], when the source location problem was proposed. Roughly speaking, the mission of the source location problem on a network is to find a set of sources from which enough flow can be sent to each prescribed target. In addition to flow requirements, connectivity and vertex coverage are also frequently used constraints. Work in this line can be classified into two categories. One is independent source location, meaning that the flows to different targets do not interact [3, 30, 4, 24, 25, 13, 3, 18]. The other is simultaneous source location, where the flows concurrently exist and interact by competing edge capacities [2, 12]. An interesting application is emergency evacuation planning [12], where shelters are to be located where residents in a disaster can move to as fast as possible. In such applications, capacities are also usually imposed on network nodes, rather than just on edges in typical network flow models. All the mentioned works have two common features. First, essentially only a single commodity is considered which is multi-source multi-target. Second, the objective is to optimize some measures of the selected sources (say, total cost), rather than the properties of the flow (say, flow value). This is in sharp contrast to our proposed problem.

2 Preliminaries and Problem Statement

In this section, we review key notions and notations used in this paper, and formally define the location problem.

2.1 Preliminaries

Let \mathbb{R} (+,\mathbb{R}_{+},\mathbb{R}_{-}, respectively) represent the set of (non-negative, non-positive, respectively) real numbers. We use x\vec{x} for a vector, and x(y)\vec{x}(y) for its yy-th entry. When we denote a set by an upper-case letter, we usually write the corresponding (subscripted) lower-case letter for the members.

A network is a capacitated graph G=(V,E,c)G=(V,E,\vec{c}), where VV is the vertex set, EE is the edge set, and c+E\vec{c}\in\mathbb{R}_{+}^{E} assigns capacities to the edges. We first mainly focus on undirected graphs in this paper, and will consider directed graphs in Section 4. For any v,vVv,v^{\prime}\in V, we use v,v\langle v,v^{\prime}\rangle, or v,v\langle v^{\prime},v\rangle interchangeably, to denote the edge between v,vv,v^{\prime}. A commodity is described by a demand vector dV\vec{d}\in\mathbb{R}^{V} satisfying vVd(v)=0\sum_{v\in V}\vec{d}(v)=0, where any vv such that d(v)<0\vec{d}(v)<0 (d(v)>0\vec{d}(v)>0, respectively) is called a source (a target, respectively). Intuitively, each source vv has to sent out d(v)\vec{d}(v) units of the commodity, and in total d(u)\vec{d}(u) units are delivered to target uu. The vertex set of a graph GG is denoted by V(G)V(G).

To specify flows over a network, we always arbitrarily orient all the edges and keep the orientation implicit unless necessary. For any vVv\in V, let E(v)E_{-}(v) (E+(v)E_{+}(v), respectively) stands for the set of incoming (outgoing, respectively) edges. A flow is a vector fE\vec{f}\in\mathbb{R}^{E}, which for any edge eEe\in E, means |f(e)||\vec{f}(e)| units of transportation along ee in orientation if f(e)>0\vec{f}(e)>0, and opposite direction otherwise. Given flows f,fE\vec{f},\vec{f}^{\prime}\in\mathbb{R}^{E}, we write ff\vec{f}\lesssim\vec{f}^{\prime} if |f||f||\vec{f}|\leq|\vec{f}^{\prime}| for any eEe\in E. A flow f\vec{f} is said to satisfy a demand vector d\vec{d}, if for any vVv\in V, d(v)=eE(v)f(e)eE+(v)f(e)\vec{d}(v)=\sum_{e\in E_{-}(v)}\vec{f}(e)-\sum_{e\in E_{+}(v)}\vec{f}(e). A multi-commodity flow, which means a set FF of flows, is valid if its congestion fF|f(e)|\sum_{\vec{f}\in F}|\vec{f}(e)| along any edge eEe\in E is at most c(e)\vec{c}(e).

The maximum concurrent problem (MCF for short) has been extensively and is still being actively studied. Specifically, given demand vectors di,1ik\vec{d}_{i},1\leq i\leq k on a capacitated graph GG, the mission of MCF is to find the maximum λ\lambda such that λdi,1ik,\lambda\vec{d}_{i},1\leq i\leq k, can be satisfied by a valid multi-commodity flow on GG. The optimum λ\lambda will be denoted by λ(G;d1,,dk)\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k}).

Let’s recall some properties of MCF.

Lemma 1.

MCF lies in P.

As mentioned in [7, page 863], there is no known purely combinatorial algorithm solving MCF exactly and efficiently. The only commonly used algorithm is based on linear programming.

A multi-commodity flow FF on a capacitated graph GG is said to be a decomposition of flow f\vec{f}, if f(e)=fFf(e)\vec{f}(e)=\sum_{\vec{f}^{\prime}\in F}\vec{f}^{\prime}(e) and |f(e)|=fF|f(e)||\vec{f}(e)|=\sum_{\vec{f}^{\prime}\in F}|\vec{f}^{\prime}(e)| for any edge ee of GG.

Lemma 2.

Arbitrarily fix a demand vector d\vec{d} on a capacitated graph GG. Suppose d\vec{d} has exactly one target tt. Then any flow satisfying d\vec{d} can be decomposed into a multi-commodity flow which satisfies the demand vectors {dv:v is a source of d}\{\vec{d}_{v}:v\textrm{ is a source of }\vec{d}\}. Here, each dv\vec{d}_{v} is such that for any vertex uu of GG,

dv(u)={d(v)if u=vd(v)if u=t0otherwise.\vec{d}_{v}(u)=\begin{cases}\vec{d}(v)&\textrm{if }u=v\\ -\vec{d}(v)&\textrm{if }u=t\\ 0&\textrm{otherwise}\end{cases}.

Note that the decomposition in Lemma 2 is not necessarily unique. Any such one will be called a canonical decomposition of the flow.

Given any vertex subset UU of an graph G=(V,E,c)G=(V,E,\vec{c}), the cut induced by UU, denoted by Cut(U)Cut(U), is defined to be the set of edges bridging UU and VUV\setminus U. Let E(U)=Cut(U)(uUE(u))E_{-}(U)=Cut(U)\bigcap(\bigcup_{u\in U}E_{-}(u)) be the set of edges coming into UU, and E+(U)=Cut(U)E(U)E_{+}(U)=Cut(U)\setminus E_{-}(U).

Lemma 3.

Suppose that f\vec{f} is a flow satisfying a demand vector d\vec{d} on a capacitated graph G=(V,E,c)G=(V,E,\vec{c}). Then for any UVU\subseteq V, eE(U)f(e)eE+(U)f(e)=uUd(u)\sum_{e\in E_{-}(U)}\vec{f}(e)-\sum_{e\in E_{+}(U)}\vec{f}(e)=\sum_{u\in U}\vec{d}(u).

2.2 Target Location problem

Intuitively, our goal is to properly locate targets for multiple commodities. We formulate this problem in this subsection.

Given a capacitated graph G=(V,E,c)G=(V,E,\vec{c}), any sV\vec{s}\in\mathbb{R}_{-}^{V} is called a supply vector on GG. For any supply vector s\vec{s} and vVv\in V, we define a demand vector sv\vec{s}\circ v such that for any uVu\in V,

(sv)(u)={s(u)if uvwV{v}s(w)otherwise.(\vec{s}\circ v)(u)=\begin{cases}\vec{s}(u)&\textrm{if }u\neq v\\ -\sum_{w\in V\setminus\{v\}}\vec{s}(w)&\textrm{otherwise}\end{cases}.

It is time to formulate the problem of target location for maximizing concurrent multi-commodity flow, LoMuF for short. Given supply vectors s1,,sk\vec{s}_{1},\cdots,\vec{s}_{k} on a capacitated graph GG, LoMuF aims at finding v1,,vkv_{1},\cdots,v_{k} such that λ(G;s1v1,,skvk)\lambda(G;\vec{s}_{1}\circ v_{1},\cdots,\vec{s}_{k}\circ v_{k}) is maximized. By abuse of the notation, the optimum objective value is again denoted by λ(G;s1,,sk)\lambda(G;\vec{s}_{1},\cdots,\vec{s}_{k}).

3 Hardness and Algorithms of LoMuF

We begin with studying the hardness of LoMuF. Our work refers to a well-known NP-complete problem, 3-dimensional matching (3-DM for short). Though LoMuF is NP-hard in general, we devise an algorithm solving LoMuF problems on trees efficiently, and show that a simple strategy could be a not-bad solution for graphs with bounded sources.

3.1 Hardness Result

A 3-DM instance is a quadruple (X,Y,Z,W)(X,Y,Z,W), where X,Y,ZX,Y,Z are pairwise disjoint finite sets of equal size, and W{{x,y,z}:xX,yY,zZ}W\subseteq\{\{x,y,z\}:x\in X,y\in Y,z\in Z\}. The goal is to decide whether WW contains a perfect matching, namely, a subset WWW^{\prime}\subseteq W such that |W|=|X||W^{\prime}|=|X| and wWw=XYZ\bigcup_{w\in W^{\prime}}w=X\bigcup Y\bigcup Z? The trivial cases where wWwXYZ\bigcup_{w\in W}w\neq X\bigcup Y\bigcup Z will not be considered.

We first show that LoMuF is NP-hard, which is more or less a surprise, compared with Lemma 1.

Theorem 4.

Given supply vectors s1,,sk\vec{s}_{1},\cdots,\vec{s}_{k} on a capacitated graph GG, it is NP-complete to decide whether λ(G;s1,,sk)1\lambda(G;\vec{s}_{1},\cdots,\vec{s}_{k})\geq 1.

Proof.

Choose a target viv_{i} for supply vectors si\vec{s}_{i}, for any 1ik1\leq i\leq k. Due to Lemma 1, we can use v1,,vkv_{1},\cdots,v_{k} as a certificate to check whether λ(G;s1,,sk)1\lambda(G;\vec{s}_{1},\cdots,\vec{s}_{k})\geq 1. This means that the decision problem lies in NP.

To prove NP-completeness, it suffices to establish a reduction from 3-DM.

Refer to caption
Figure 2: The graph to which 3-DM is reduced.

Given a 3-DM instance (X,Y,Z,W)(X,Y,Z,W) with |X|=k|X|=k and |W|=l|W|=l, we construct an capacitated graph G=(V,E,c)G=(V,E,\vec{c}) as illustrated in Figure 2. Specifically, GG consists of three subgraphs HX,HY,HZH_{X},H_{Y},H_{Z} connected via WW. HXH_{X} is a complete bipartite graph of vertex sets XX and TX={tX,tX}T_{X}=\{t_{X},t^{\prime}_{X}\}, and any xXx\in X is adjacent to wWw\in W if and only if xwx\in w, likewise for HY,HZH_{Y},H_{Z}. All the edges are oriented upward in Figure 2.

As to the capacity, let EE^{\prime} be the set of red edges, namely, those incident to tX,tYt^{\prime}_{X},t^{\prime}_{Y} or tZt^{\prime}_{Z}. For any e=v,tEe=\langle v,t\rangle\in E^{\prime} with vXYZv\in X\bigcup Y\bigcup Z, let We={wW:vw}W_{e}=\{w\in W:v\in w\}. Then for any eEe\in E,

c(e)={|We|1if e=E1otherwise.\vec{c}(e)=\begin{cases}|W_{e}|-1&\textrm{if }e=E^{\prime}\\ 1&\textrm{otherwise}\end{cases}.

We define ll supply vectors d1==dk,dk+1==dl\vec{d}_{1}=\cdots=\vec{d}_{k},\vec{d}_{k+1}=\cdots=\vec{d}_{l} such that for any vVv\in V,

d1(v)={1if v{tX,tY,tZ}0otherwise\vec{d}_{1}(v)=\begin{cases}-1&\textrm{if }v\in\{t_{X},t_{Y},t_{Z}\}\\ 0&\textrm{otherwise}\end{cases}
dk+1(v)={1if v{tX,tY,tZ}0otherwise\vec{d}_{k+1}(v)=\begin{cases}-1&\textrm{if }v\in\{t^{\prime}_{X},t^{\prime}_{Y},t^{\prime}_{Z}\}\\ 0&\textrm{otherwise}\end{cases}

The rest of the proof is devoted to showing that WW has a perfect matching if and only if the LoMuF instance satisfies λ(G;d1,,dl)1\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{l})\geq 1, which will lead to NP-completeness of our decision problem. The proof consists of two parts.

Part 1: a perfect matching in WW implies λ(G;d1,,dl)1\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{l})\geq 1.

Without loss of generality, suppose {w1,,wk}W\{w_{1},\cdots,w_{k}\}\subset W is a perfect matching. For any 1ik1\leq i\leq k, define flow fi\vec{f}_{i} such that for any edge eEe\in E,

fi(e)={1if e is incident to wi, or e=t,u with t{tX,tY,tZ} and uwi0otherwise.\vec{f}_{i}(e)=\begin{cases}1&\textrm{if }e\textrm{ is incident to }w_{i}\textrm{, or }e=\langle t,u\rangle\textrm{ with }t\in\{t_{X},t_{Y},t_{Z}\}\textrm{ and }u\in w_{i}\\ 0&\textrm{otherwise}\end{cases}.

For any k+1jlk+1\leq j\leq l, define flow fj\vec{f}_{j} such that for any edge eEe\in E,

fj(e)={1if e is incident to wj, or e=t,u with t{tX,tY,tZ} and uwj0otherwise.\vec{f}_{j}(e)=\begin{cases}1&\textrm{if }e\textrm{ is incident to }w_{j}\textrm{, or }e=\langle t,u\rangle\textrm{ with }t\in\{t^{\prime}_{X},t^{\prime}_{Y},t^{\prime}_{Z}\}\textrm{ and }u\in w_{j}\\ 0&\textrm{otherwise}\end{cases}.

It is straightforward to check that the multi-commodity flow f1,,fl\vec{f}_{1},\cdots,\vec{f}_{l} is valid and satisfies the demand vectors d1w1,,dlwl\vec{d}_{1}\circ w_{1},\cdots,\vec{d}_{l}\circ w_{l}. Hence, λ(G;d1,,dl)1\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{l})\geq 1.

Part 2: λ(G;d1,,dl)1\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{l})\geq 1 implies a perfect matching in WW.

Suppose the optimum targets are v1,,vlv_{1},\cdots,v_{l}, and the multi-commodity flow f1,,fl\vec{f}_{1},\cdots,\vec{f}_{l} is valid and satisfies d1v1,,dlvl\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{l}\circ v_{l}. Part 2 immediately follows from the two facts:

  1. 1.

    Fact 1: viWv_{i}\in W for any 1il1\leq i\leq l.

    Consider the congestion of any fi\vec{f}_{i} on Cut(W)=Cut(V(HX))Cut(V(HY))Cut(V(HZ))Cut(W)=Cut(V(H_{X}))\bigcup Cut(V(H_{Y}))\bigcup Cut(V(H_{Z})). Let’s proceed case by case.

    • viWv_{i}\in W. Applying Lemma 3 to fi,divi\vec{f}_{i},\vec{d}_{i}\circ v_{i}, we see that the congestion of fi\vec{f}_{i} on Cut(W)Cut(W) is at least 3.

    • viWv_{i}\notin W. Without loss of generality, assume viV(HX)v_{i}\in V(H_{X}). Applying Lemma 3 to fi,divi\vec{f}_{i},\vec{d}_{i}\circ v_{i}, we see that the congestion of fi\vec{f}_{i} on Cut(V(HX))Cut(V(H_{X})) is at least 2, and those on Cut(V(HY))Cut(V(H_{Y})) and Cut(V(HZ))Cut(V(H_{Z})) are both at least 1. Hence, the congestion of fi\vec{f}_{i} on Cut(W)Cut(W) is at least 4.

    Since the total capacity of Cut(W)Cut(W) is 3l3l which upper-bounds the total congestion, we get Fact 1.

  2. 2.

    Fact 2: the sets v1,,vkv_{1},\cdots,v_{k} are pairwise disjoint.

    For contradiction, suppose without loss of generality that v1=w1,v2=w2,x1w1w2v_{1}=w_{1},v_{2}=w_{2},x_{1}\in w_{1}\bigcap w_{2}. Applying Lemma 3 to multi-commodity flow fi,1il\vec{f}_{i},1\leq i\leq l and command vectors divi,1il\vec{d}_{i}\circ v_{i},1\leq i\leq l, we have eCut(W)fi(e)=3l\sum_{e\in Cut(W)}\vec{f}_{i}(e)=3l. This implies that fi(e)=1\vec{f}_{i}(e)=1 for any eCut(W)e\in Cut(W). Namely, each edge in Cut(W)Cut(W) is full of upward flow. Likewise, each edge in Cut(TX)Cut(T_{X}) is also full of upward flow. Let e=tX,x1e=\langle t_{X},x_{1}\rangle. Then we have f1(e)=f2(e)=0f_{1}(e^{\prime})=f_{2}(e^{\prime})=0 for any edge eee^{\prime}\neq e in HXH_{X}, since flow along such an edge can’t reach w1w_{1} or w2w_{2}. This, together with the precondition that f1\vec{f}_{1} satisfies d1v1\vec{d}_{1}\circ v_{1}, implies f1(e)=1\vec{f}_{1}(e)=1. Likewise, f2(e)=1\vec{f}_{2}(e)=1. A contradiction is reached since c(e)=1\vec{c}(e)=1.

3.2 LoMuF on Trees

Theorem 4 indicates that LoMuF is hard to solve on general graphs, but does not exclude the possibility of an efficient algorithm solving LoMuF for some important special case. Indeed, LoMuF on trees allows a fast algorithm, as presented in Algorithm 1. Actually, networks with tree structure is the also the center of related literature [30, 36, 5, 24, 25, 13, 2].

Without loss of generality, trees will be arbitrarily rooted, so the concepts of ancestors, descendants, and subtrees are well defined as usual. Given vertices u,vu,v of a tree, we write uvu\prec v if uu is a descendant of vv, and uvu\preceq v if uvu\prec v or u=vu=v.

Let’s begin with a polynomial-time algorithm, which turns out to exactly solve LoMuF on trees.

Input: a capacitated tree G=(V,E,c)G=(V,E,\vec{c}), supply vectors d1,,dk\vec{d}_{1},\cdots,\vec{d}_{k}
Output: viV,1ikv_{i}\in V,1\leq i\leq k

1:for each 1ik1\leq i\leq k do
2:     Let viv_{i} be the lowest common ancestor of the sources of di\vec{d}_{i}
3:     while there is a child uu of viv_{i} such that vu|di(v)|<vu|di(v)|\sum_{v\not\preceq u}|\vec{d}_{i}(v)|<\sum_{v\preceq u}|\vec{d}_{i}(v)| do
4:         Let viv_{i} be uu      
5:Output(v1,,vkv_{1},\cdots,v_{k})
Algorithm 1 The algorithm for LoMuF on trees.
Theorem 5.

The output of Algorithm 1 is an optimum solution to LoMuF on trees.

Proof.

Given a capacitated tree G=(V,E,c)G=(V,E,\vec{c}) and supply vectors d1,,dkV\vec{d}_{1},\cdots,\vec{d}_{k}\in\mathbb{R}_{-}^{V}, let v1,,vkv_{1},\cdots,v_{k} be the output of Algorithm 1. Orient any edge of GG upward, i.e., from a vertex to its parent. The theorem is proven in two steps.

Step 1: Arbitrarily fix 1ik1\leq i\leq k. We claim that for any wVw\in V, any λ>0\lambda>0, and any flow f\vec{f} satisfying λdiw\lambda\vec{d}_{i}\circ w, there is a flow ff\vec{f}^{\prime}\lesssim\vec{f} which satisfies λdivi\lambda\vec{d}_{i}\circ v_{i}.

The claim is proved by induction on the hop distance (i.e., the number of edges) between viv_{i} and ww, denoted by dist(vi,w)dist(v_{i},w).

Basis: The claim trivially holds when dist(vi,w)=0dist(v_{i},w)=0.

Hypothesis: The claim holds when dist(vi,w)<δdist(v_{i},w)<\delta.

Induction: dist(vi,w)=δ>0dist(v_{i},w)=\delta>0. Let xx be the lowest common ancestor of the sources of di\vec{d}_{i}. We proceed case by case.

Case 1: viwv_{i}\prec w.

If xwx\prec w, set flow f\vec{f}^{\prime} such that for any edge eEe\in E,

f(e)={f(e)if e lies in the subtree rooted at x0otherwise.\vec{f}^{\prime}(e)=\begin{cases}\vec{f}(e)&\textrm{if }e\textrm{ lies in the subtree rooted at }x\\ 0&\textrm{otherwise}\end{cases}.

One can easily check that ff\vec{f}^{\prime}\lesssim\vec{f} and f\vec{f}^{\prime} satisfies λdix\lambda\vec{d}_{i}\circ x. Let w=xw^{\prime}=x.

If wxw\preceq x, it must happen that vi=wv_{i}=w at the beginning of some “while loop” of Algorithm 1 when handling di\vec{d}_{i}. That loop must assign uu to viv_{i}, where uu is the child of ww satisfying the condition in Line 3. Note that uu lies on the path between ww and the final viv_{i}. Set flow f\vec{f}^{\prime} such that for any edge eEe\in E,

f(e)={λvudi(v)if e=u,wf(e)otherwise.\vec{f}^{\prime}(e)=\begin{cases}\lambda\sum_{v\not\preceq u}\vec{d}_{i}(v)&\textrm{if }e=\langle u,w\rangle\\ \vec{f}(e)&\textrm{otherwise}\end{cases}.

By Lemma 3, we see that f(u,w)=λvu|di(v)|\vec{f}(\langle u,w\rangle)=\lambda\sum_{v\preceq u}|\vec{d}_{i}(v)|. Then the condition in Line 3 implies ff\vec{f}^{\prime}\lesssim\vec{f}. Furthermore, one can check that f\vec{f}^{\prime} satisfies λdiu\lambda\vec{d}_{i}\circ u. Let w=uw^{\prime}=u.

Case 2: wviw\prec v_{i}. Let yy be the child of viv_{i} such that wyw\preceq y. Let Ew,viE_{w,v_{i}} be the edges on the path between ww and viv_{i}. Define flow f\vec{f}^{\prime} such that for any edge eEe\in E,

f(e)={λvu|di(v)|if e=u,uEw,vi with uuf(e)otherwise.\vec{f}^{\prime}(e)=\begin{cases}\lambda\sum_{v\preceq u}|\vec{d}_{i}(v)|&\textrm{if }e=\langle u,u^{\prime}\rangle\in E_{w,v_{i}}\textrm{ with }u\prec u^{\prime}\\ \vec{f}(e)&\textrm{otherwise}\end{cases}.

Since Algorithm 1 outputs viv_{i} rather than yy for di\vec{d}_{i}, it must hold that

vy|di(v)|vy|di(v)|.\displaystyle\sum_{v\not\preceq y}|\vec{d}_{i}(v)|\geq\sum_{v\preceq y}|\vec{d}_{i}(v)|. (1)

For any edge e=u,uEw,vie=\langle u,u^{\prime}\rangle\in E_{w,v_{i}} with uuu\prec u^{\prime}, we have

|f(e)|=λvu|di(v)|by Lemma 3λvy|di(v)|by uyλvy|di(v)|by Inequality (1)λvu|di(v)|by uy=|f(e)|.\begin{array}[]{rll}|\vec{f}(e)|&=\lambda\sum_{v\not\preceq u}|\vec{d}_{i}(v)|&\textrm{by Lemma \ref{le:cutflow}}\\ &\geq\lambda\sum_{v\not\preceq y}|\vec{d}_{i}(v)|&\textrm{by }u\preceq y\\ &\geq\lambda\sum_{v\preceq y}|\vec{d}_{i}(v)|&\textrm{by Inequality \eqref{equa:conditioninline3}}\\ &\geq\lambda\sum_{v\preceq u}|\vec{d}_{i}(v)|&\textrm{by }u\preceq y\\ &=|\vec{f}^{\prime}(e)|.&\end{array}

Hence, ff\vec{f}^{\prime}\lesssim\vec{f}. One can also check that f\vec{f}^{\prime} satisfies λdivi\lambda\vec{d}_{i}\circ v_{i}. Let w=viw^{\prime}=v_{i}.

Case 3: neither wviw\preceq v_{i} nor viwv_{i}\preceq w. Let yy be the lowest common ancestor of ww and viv_{i}. We have either xyx\prec y or yxy\prec x.

If xyx\prec y, define flow f\vec{f}^{\prime} such that for any edge eEe\in E,

f(e)={f(e)if e lies in the subtree rooted at x0otherwise.\vec{f}^{\prime}(e)=\begin{cases}\vec{f}(e)&\textrm{if }e\textrm{ lies in the subtree rooted at }x\\ 0&\textrm{otherwise}\end{cases}.

Then ff\vec{f}^{\prime}\lesssim\vec{f} and f\vec{f}^{\prime} satisfies λdivi\lambda\vec{d}_{i}\circ v_{i}. Let w=xw^{\prime}=x.

If yxy\prec x, yy lies on the path between viv_{i} and xx. Hence, it must happen that vi=yv_{i}=y at the beginning of some “while loop” of Algorithm 1 when handling di\vec{d}_{i}. Then that loop does not choose the subtree of yy containing ww. Follow the argument of Case 2, there is a flow ff\vec{f}^{\prime}\lesssim\vec{f} which satisfies the demand vector λdiy\lambda\vec{d}_{i}\circ y. Let w=yw^{\prime}=y.

Altogether, we always have a flow ff\vec{f}^{\prime}\lesssim\vec{f} which satisfies the demand vector λdiw\lambda\vec{d}_{i}\circ w^{\prime}. Because dist(vi,w)<dist(vi,w)=δdist(v_{i},w^{\prime})<dist(v_{i},w)=\delta, we apply the induction hypothesis and finish step 1.

Step 2: Let λ=λ(G;d1,,dk)\lambda^{*}=\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k}). Choose w1,,wkVw_{1},\cdots,w_{k}\in V such that there is a valid multi-commodity flow f1,,fk\vec{f}_{1},\cdots,\vec{f}_{k} satisfying λd1w1,,λdkwk\lambda^{*}\vec{d}_{1}\circ w_{1},\cdots,\lambda^{*}\vec{d}_{k}\circ w_{k}. For any 1ik1\leq i\leq k, apply the claim in step 1 to wiw_{i} and fi\vec{f}_{i}, resulting in a flow fifi\vec{f}^{\prime}_{i}\lesssim\vec{f}_{i} which satisfies λdivi\lambda^{*}\vec{d}_{i}\circ v_{i}. Therefore, we get a valid multi-commodity flow f1,,fk\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k} satisfying λd1v1,,λdkvk\lambda^{*}\vec{d}_{1}\circ v_{1},\cdots,\lambda^{*}\vec{d}_{k}\circ v_{k}. This means that the output of Algorithm 1 is an optimum solution to LoMuF. ∎

3.3 Approximation Algorithm on General Graphs

Theorem 5 suggests that LoMuF is not extremely intractable, at least in a special case. Fortunately, the tractability can be extended to more general graphs, in the sense of approximation. Let’s begin with a lemma, which shows the important role of master sources (defined below) in approximating LoMuF.

Arbitrarily fix a supply vector d\vec{d} on a capacitated graph G=(V,E,c)G=(V,E,\vec{c}). Arbitrarily choose θ|S|>1\theta\geq|S|>1, where SS is the set of sources of d\vec{d}. Let ww be a master source of d\vec{d}, namely w=argmaxvV|d(v)|w=\operatorname*{\mathop{\arg\max}}_{v\in V}|\vec{d}(v)|.

Lemma 6.

For any uVu\in V and flow f\vec{f} satisfying du\vec{d}\circ u, there is a flow ff\vec{f}^{\prime}\lesssim\vec{f} which satisfies 1θ1dw\frac{1}{\theta-1}\vec{d}\circ w.

Proof.

We proceed case by case.

Case 1: uSu\notin S. For any sSs\in S, define demand vector ds\vec{d}_{s} such that for any vVv\in V,

ds(v)={d(s)if v=sd(s)if v=u0otherwise.\vec{d}_{s}(v)=\begin{cases}\vec{d}(s)&\textrm{if }v=s\\ -\vec{d}(s)&\textrm{if }v=u\\ 0&\textrm{otherwise}\end{cases}.

By Lemma 2, f\vec{f} has a decomposition {fs:sS}\{\vec{f}_{s}:s\in S\} satisfying {ds:sS}\{\vec{d}_{s}:s\in S\}.

Now for any sS{w}s\in S\setminus\{w\}, define flow fs\vec{f}^{\prime}_{s} such that for any eEe\in E,

fs(e)=1θ1(fs(e)fw(e)d(s)d(w)),\vec{f}^{\prime}_{s}(e)=\frac{1}{\theta-1}\left(\vec{f}_{s}(e)-\vec{f}_{w}(e)\frac{\vec{d}(s)}{\vec{d}(w)}\right),

and demand vector ds\vec{d}^{\prime}_{s} such that for any vVv\in V,

ds(v)={d(s)if v=sd(s)if v=w0otherwise.\vec{d}^{\prime}_{s}(v)=\begin{cases}\vec{d}(s)&\textrm{if }v=s\\ -\vec{d}(s)&\textrm{if }v=w\\ 0&\textrm{otherwise}\end{cases}.

Our task is reduced to establishing three claims.

Claim 1: for any sS{w}s\in S\setminus\{w\}, fs\vec{f}^{\prime}_{s} satisfies 1θ1ds\frac{1}{\theta-1}\vec{d}^{\prime}_{s}.

It suffices to show ϕ(v,fs)=1θ1ds(v)\phi(v,\vec{f}^{\prime}_{s})=\frac{1}{\theta-1}\vec{d}^{\prime}_{s}(v) for any vVv\in V, where ϕ(x,g)=eE(x)g(e)eE+(x)g(e)\phi(x,\vec{g})=\sum_{e\in E_{-}(x)}\vec{g}(e)-\sum_{e\in E_{+}(x)}\vec{g}(e), which is the net incoming of flow g\vec{g} at vertex xx. Obviously, ϕ(x,g)\phi(x,\vec{g}) is linear in g\vec{g}.

Arbitrarily fix vVv\in V. By definition of fs\vec{f}^{\prime}_{s},

ϕ(v,fs)=1θ1ϕ(v,fsfwd(s)d(w))=1θ1(ϕ(v,fs)d(s)d(w)ϕ(v,fw))=1θ1(ds(v)d(s)d(w)dw(v))(since fs,fw satisfy ds,dw)=1θ1ds(v)(by definition of ds,dw,ds)\begin{array}[]{rcl}\phi(v,\vec{f}^{\prime}_{s})&=&\frac{1}{\theta-1}\phi\left(v,\vec{f}_{s}-\vec{f}_{w}\frac{\vec{d}(s)}{\vec{d}(w)}\right)\\ &=&\frac{1}{\theta-1}\left(\phi(v,\vec{f}_{s})-\frac{\vec{d}(s)}{\vec{d}(w)}\phi(v,\vec{f}_{w})\right)\\ &=&\frac{1}{\theta-1}\left(\vec{d}_{s}(v)-\frac{\vec{d}(s)}{\vec{d}(w)}\vec{d}_{w}(v)\right)\qquad(\textrm{since }\vec{f}_{s},\vec{f}_{w}\textrm{ satisfy }\vec{d}_{s},\vec{d}_{w})\\ &=&\frac{1}{\theta-1}\vec{d}^{\prime}_{s}(v)\qquad(\textrm{by definition of }\vec{d}_{s},\vec{d}_{w},\vec{d}^{\prime}_{s})\end{array}

Claim 2: f=sS{w}fs\vec{f}^{\prime}=\sum_{s\in S\setminus\{w\}}\vec{f}^{\prime}_{s} satisfies 1θ1dw\frac{1}{\theta-1}\vec{d}\circ w. It immediately follows from Claim 1.

Claim 3: ff\vec{f}^{\prime}\lesssim\vec{f}.

It holds because for any eEe\in E,

|f(e)|=|sS{w}fs(e)|sS{w}|fs(e)|=sS{w}1θ1|fs(e)fw(e)d(s)d(w)|sS{w}|fs(e)|θ1+|fw(e)|θ1sS{w}d(s)d(w)sS{w}|fs(e)|+|fw(e)|=|f(e)|\begin{array}[]{rcl}|\vec{f}^{\prime}(e)|&=&|\sum_{s\in S\setminus\{w\}}\vec{f}^{\prime}_{s}(e)|\\ &\leq&\sum_{s\in S\setminus\{w\}}|\vec{f}^{\prime}_{s}(e)|\\ &=&\sum_{s\in S\setminus\{w\}}\frac{1}{\theta-1}\left|\vec{f}_{s}(e)-\vec{f}_{w}(e)\frac{\vec{d}(s)}{\vec{d}(w)}\right|\\ &\leq&\sum_{s\in S\setminus\{w\}}\frac{|\vec{f}_{s}(e)|}{\theta-1}+\frac{|\vec{f}_{w}(e)|}{\theta-1}\sum_{s\in S\setminus\{w\}}\frac{\vec{d}(s)}{\vec{d}(w)}\\ &\leq&\sum_{s\in S\setminus\{w\}}|\vec{f}_{s}(e)|+|\vec{f}_{w}(e)|\\ &=&|\vec{f}(e)|\end{array}

The proof of Case 1 finishes.

Case 2: u=wSu=w\in S. The lemma trivially holds.

Case 3: uS{w}u\in S\setminus\{w\}.

The proof of Case 1 almost works, except that du\vec{d}_{u} is not well-defined and the decomposition of f\vec{f} does not include fu\vec{f}_{u}. As a result, we still apply the proof of Case 1, after defining fuE\vec{f}_{u}\in\mathbb{R}^{E} and duV\vec{d}_{u}\in\mathbb{R}^{V} to be all-zero vectors. ∎

Remark 1.

Lemma 6 remains true if θ\theta is replaced by ηvVd(v)d(w)\eta\geq\frac{\sum_{v\in V}\vec{d}(v)}{\vec{d}(w)}.

Algorithm 2 is a simple algorithm for LoMuF with guaranteed approximation ratio.

Input: a capacitated graph G=(V,E,c)G=(V,E,\vec{c}), supply vectors d1,,dk\vec{d}_{1},\cdots,\vec{d}_{k}
Output: wiV,1ikw_{i}\in V,1\leq i\leq k

1:for each 1ik1\leq i\leq k do
2:     Output wi=argmaxvV|di(v)|w_{i}=\operatorname*{\mathop{\arg\max}}_{v\in V}|\vec{d}_{i}(v)| as the target of di\vec{d}_{i}
Algorithm 2 An approximation algorithm for LoMuF.
Theorem 7.

Algorithm 2 is max{θ1,1}\max\{\theta-1,1\}-approximate, where θ=max1ik|{vV:di(v)<0}|\theta=\max_{1\leq i\leq k}|\{v\in V:\vec{d}_{i}(v)<0\}|.

Proof.

Arbitrarily fix a capacitated graph G=(V,E,c)G=(V,E,\vec{c}) and supply vectors d1,,dkV\vec{d}_{1},\cdots,\vec{d}_{k}\in\mathbb{R}_{-}^{V} as input to Algorithm 2. Let w1,,wkw_{1},\cdots,w_{k} be the output. If θ=1\theta=1, each wiw_{i} the unique source of di\vec{d}_{i}, which is trivially optimum. Hence, we assume θ>1\theta>1 and show that establish approximation ratio θ1\theta-1.

Let λ=λ(G;d1,,dk)\lambda^{*}=\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k}). Suppose u1,,ukVu_{1},\cdots,u_{k}\in V is an optimum solution to LoMuF. This means that there is a multi-commodity flow {f1,,fk}\{\vec{f}_{1},\cdots,\vec{f}_{k}\} satisfying {λd1u1,,λdkuk}\{\lambda^{*}\vec{d}_{1}\circ u_{1},\cdots,\lambda^{*}\vec{d}_{k}\circ u_{k}\}.

For any 1ik1\leq i\leq k, apply Lemma 6 with w=wi,u=ui,f=fi,d=λdiw=w_{i},u=u_{i},\vec{f}=\vec{f}_{i},\vec{d}=\lambda^{*}\vec{d}_{i}, getting a flow fifi\vec{f}^{\prime}_{i}\lesssim\vec{f}_{i} which satisfies λθ1diwi\frac{\lambda^{*}}{\theta-1}\vec{d}_{i}\circ w_{i}. As a result, we find a valid multi-commodity flow {f1,,fk}\{\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k}\} satisfying {λθ1d1w1,,λθ1diwi}\{\frac{\lambda^{*}}{\theta-1}\vec{d}_{1}\circ w_{1},\cdots,\frac{\lambda^{*}}{\theta-1}\vec{d}_{i}\circ w_{i}\}, so λ(G;d1w1,,dkwk)λθ1\lambda(G;\vec{d}_{1}\circ w_{1},\cdots,\vec{d}_{k}\circ w_{k})\geq\frac{\lambda^{*}}{\theta-1}. The proof ends. ∎

Remark 2.

By applying Remark 1 rather than Lemma 6, Theorem 7 remains true if θ\theta is replaced by η=max1ikvVdi(v)di(wi)\eta=\max_{1\leq i\leq k}\frac{\sum_{v\in V}\vec{d}_{i}(v)}{\vec{d}_{i}(w_{i})} which is not bigger than θ\theta. Hereunder, this η\eta will be called concentration of the supply vectors. It intuitively indicates how much demands are concentrated on sources.

Note that in Remark 2, η1\eta\leq 1 if the wiw_{i}-entry dominates di\vec{d}_{i} for any 1ik1\leq i\leq k, namely |di(wi)|vwi|di(v)||\vec{d}_{i}(w_{i})|\geq\sum_{v\neq w_{i}}|\vec{d}_{i}(v)|. A special such case is when every supply vector has no more than 2 sources. Then by Remark 2, we immediately have the following corollary.

Corollary 8.

When every supply vector has a dominant entry, Algorithm 2 exactly solves LoMuF.

4 Hardness and Algorithms of Di-LoMuF

In this section, we adapt LoMuF to networks modeled as directed graphs. Such networks have also been studied in the network flow community and frequently appear in nowadays practice. For example, only down-streaming traffics are allowed by many data servers.

We adopt the notation and concepts in Section 2 in case of no ambiguity, with three exceptions:

  • Every edge has an inherent direction and is called an arc. An arc from vertex uu to vertex vv is denoted by (u,v)(u,v). We usually use G=(V,A,c)G=(V,A,\vec{c}) to represent a capacitated directed GG with vertex set VV, arc set AA, and capacity vector c+A\vec{c}\in\mathbb{R}_{+}^{A}. Accordingly, A(v)A_{-}(v) (A+(v)A_{+}(v), respectively) stands for the set of incoming (outgoing, respectively) arcs at vertex vv. Likewise, define A(U)A_{-}(U) and A+(U)A_{+}(U) for vertex subset UVU\subseteq V.

  • Any arc only allows a flow in the inherent direction, so we can naturally specify a network flow using a non-negative vector f+A\vec{f}\in\mathbb{R}_{+}^{A}.

  • We continue to study the problem of target location for maximizing concurrent multi-commodity flow, but in the context of directed graphs. The problem will be called Di-LoMuF to highlight the directed model.

Note that Lemmas 1-3 still hold in the context of the directed graph model.

The following theorem indicates the strong relation between LoMuF and Di-LoMuF.

Theorem 9.

LoMuF is reducible to Di-LoMuF.

Proof.

Arbitrarily fix an capacitated graph G=(V,E,c)G=(V,E,\vec{c}) and supply vectors d1,,dkV\vec{d}_{1},\cdots,\vec{d}_{k}\in\mathbb{R}^{V}. We will construct a capacitated direct graph G=(V,A,c)G^{\prime}=(V^{\prime},A,\vec{c}^{\prime}) and supply vectors d1,,dkV\vec{d}^{\prime}_{1},\cdots,\vec{d}^{\prime}_{k}\in\mathbb{R}^{V^{\prime}}, and prove that the construction preserves the quality of solutions.

Step 1: Construct GG^{\prime} and the supply vectors.

The directed graph GG^{\prime} is obtained by replacing any edge of GG with the diamond gadget as illustrated in Figure 3. Specifically, V=V{se,te:eE}V^{\prime}=V\bigcup\{s_{e},t_{e}:e\in E\}, A={(u,se),(v,se),(se,te),(te,u),(te,v):e=u,vE}A=\{(u,s_{e}),(v,s_{e}),(s_{e},t_{e}),(t_{e},u),(t_{e},v):e=\langle u,v\rangle\in E\}, and for any arc aa in the diamond corresponding to edge ee, c(a)=c(e)\vec{c}^{\prime}(a)=\vec{c}(e). For any 1ik1\leq i\leq k, define di\vec{d}^{\prime}_{i} such that for any vVv\in V^{\prime},

di(v)={di(v)if vV0otherwise.\vec{d}^{\prime}_{i}(v)=\begin{cases}\vec{d}_{i}(v)&\textrm{if }v\in V\\ 0&\textrm{otherwise}\end{cases}.
Refer to caption
Figure 3: The gadget for reducing LoMuF to Di-LoMuF.

Step 2: Prove that for any v1,,vkVv_{1},\cdots,v_{k}\in V, λ(G;d1v1,,dkvk)λ(G;d1v1,,dkvk)\lambda(G;\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{k}\circ v_{k})\leq\lambda(G^{\prime};\vec{d}^{\prime}_{1}\circ v_{1},\cdots,\vec{d}^{\prime}_{k}\circ v_{k}).

Consider any λ\lambda and any valid multi-commodity flow {f1,,fk}\{\vec{f}_{1},\cdots,\vec{f}_{k}\} satisfying {λd1v1,,λdkvk}\{\lambda\vec{d}_{1}\circ v_{1},\cdots,\lambda\vec{d}_{k}\circ v_{k}\}. For any 1ik1\leq i\leq k, define flow fi\vec{f}^{\prime}_{i} as follows: for any e=u,vEe=\langle u,v\rangle\in E, if fi(e)\vec{f}_{i}(e) is from uu to vv, set fi(u,se)=fi(se,te)=fi(te,v)=|fi(e)|\vec{f}^{\prime}_{i}(u,s_{e})=\vec{f}^{\prime}_{i}(s_{e},t_{e})=\vec{f}^{\prime}_{i}(t_{e},v)=|\vec{f}_{i}(e)|, otherwise set fi(v,se)=fi(se,te)=fi(te,u)=|fi(e)|\vec{f}^{\prime}_{i}(v,s_{e})=\vec{f}^{\prime}_{i}(s_{e},t_{e})=\vec{f}^{\prime}_{i}(t_{e},u)=|\vec{f}_{i}(e)|; fi(a)=0\vec{f}^{\prime}_{i}(a)=0 for any other arc aa. It is straightforward to check that the multi-commodity flow {f1,,fk}\{\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k}\} is valid and satisfies {λd1v1,,λdkvk}\{\lambda\vec{d}^{\prime}_{1}\circ v_{1},\cdots,\lambda\vec{d}^{\prime}_{k}\circ v_{k}\}

Step 3: Prove that for any v1,,vkVv^{\prime}_{1},\cdots,v^{\prime}_{k}\in V^{\prime}, there are v1,,vkVv_{1},\cdots,v_{k}\in V such that λ(G;d1v1,,dkvk)λ(G;d1v1,,dkvk)\lambda(G^{\prime};\vec{d}^{\prime}_{1}\circ v^{\prime}_{1},\cdots,\vec{d}^{\prime}_{k}\circ v^{\prime}_{k})\leq\lambda(G;\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{k}\circ v_{k}).

Consider any λ\lambda and any valid multi-commodity flow {f1,,fk}\{\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k}\} satisfying {λd1v1,,λdkvk}\{\lambda\vec{d}^{\prime}_{1}\circ v^{\prime}_{1},\cdots,\lambda\vec{d}^{\prime}_{k}\circ v^{\prime}_{k}\}. For any 1ik1\leq i\leq k, define flow fi\vec{f}_{i} as follows. For any e=u,vEe=\langle u,v\rangle\in E oriented from uu to vv, we deal case by case:

  • When vi{se,te}v^{\prime}_{i}\notin\{s_{e},t_{e}\}, set fi(e)=fi(u,se)fi(v,se)\vec{f}_{i}(e)=\vec{f}^{\prime}_{i}(u,s_{e})-\vec{f}^{\prime}_{i}(v,s_{e}).

  • When vi{se,te}v^{\prime}_{i}\in\{s_{e},t_{e}\}, set fi(e)=fi(u,se)\vec{f}_{i}(e)=\vec{f}^{\prime}_{i}(u,s_{e}) if fi(u,se)<fi(v,se)\vec{f}^{\prime}_{i}(u,s_{e})<\vec{f}^{\prime}_{i}(v,s_{e}), otherwise fi(e)=fi(v,se)\vec{f}_{i}(e)=-\vec{f}^{\prime}_{i}(v,s_{e}).

Now for any 1ik1\leq i\leq k, we find a proper viVv_{i}\in V. This is also done case by case:

  • When viVv^{\prime}_{i}\in V, let vi=viv_{i}=v^{\prime}_{i}.

  • When vi{se,te}v^{\prime}_{i}\in\{s_{e},t_{e}\} for e=u,ve=\langle u,v\rangle, let vi=uv_{i}=u if fi(u,se)>fi(v,se)\vec{f}^{\prime}_{i}(u,s_{e})>\vec{f}^{\prime}_{i}(v,s_{e}), otherwise vi=vv_{i}=v.

Again, it is easy to check that the multi-commodity flow {f1,,fk}\{\vec{f}_{1},\cdots,\vec{f}_{k}\} is valid and satisfies {λd1v1,,λdkvk}\{\lambda\vec{d}_{1}\circ v_{1},\cdots,\lambda\vec{d}_{k}\circ v_{k}\}. ∎

Remark 3.

Theorem 9 implies that Di-LoMuF is at least as hard as LoMuF. Together with Theorem 4, Di-LoMuF is also NP-hard. More importantly, the reduction in the above proof preserves approximation ratio: any α\alpha-approximation algorithm of Di-LoMuF, combined with the reduction, also α\alpha-approximately solves LoMuF.

We further show that Di-LoMuF has no PTAS.

Theorem 10.

Unless P=NP, Di-LoMuF cannot be efficiently approximated with a ratio smaller than 2.

Proof.

We establish a reduction from 3-DM to Di-LoMuF and show that the solutions to Di-LoMuF has a big gap indicating whether or not a perfect matching exists.

Arbitrarily fix an instance (X,Y,Z,W)(X,Y,Z,W) of 3-DM. We construct an capacitated directed graph G=(V,A,c)G=(V,A,\vec{c}) and k=|X|k=|X| supply vectors d1,,dk\vec{d}_{1},\cdots,\vec{d}_{k}. Specifically, as illustrated in Figure 4, GG is adapted from the undirected graph in Figure 2, up to two modifications:

  • The red parts, namely, vertices tX,tY,tZt^{\prime}_{X},t^{\prime}_{Y},t^{\prime}_{Z} and their incident edges, are removed.

  • All the arcs are directed upward, as indicated by the arrows.

All the arcs has capacity 1. Define supply vectors d1==dk\vec{d}_{1}=\cdots=\vec{d}_{k} such that for any vVv\in V,

d1(v)={1if v{tX,tY,tZ}0otherwise.\vec{d}_{1}(v)=\begin{cases}-1&\textrm{if }v\in\{t_{X},t_{Y},t_{Z}\}\\ 0&\textrm{otherwise}\end{cases}.
Refer to caption
Figure 4: The directed graph to which the 3-DM is reduced.

Our theorem immediately holds if we have the following two facts:

Fact 1: If WW contains a perfect matching, λ(G;d1,,dk)1\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k})\geq 1.

To prove this fact, suppose without loss of generality that {w1,,wk}\{w_{1},\cdots,w_{k}\} is a perfect matching in WW. For any 1ik1\leq i\leq k, assume wi={x,y,z}w_{i}=\{x,y,z\} with xX,yY,zZx\in X,y\in Y,z\in Z, and define a flow fi\vec{f}_{i} such that for any arc aAa\in A,

fi(a)={1if a{(tX,x),(x,wi),(tY,y),(y,wi),(tZ,z),(z,wi)}0otherwise.\vec{f}_{i}(a)=\begin{cases}1&\textrm{if }a\in\{(t_{X},x),(x,w_{i}),(t_{Y},y),(y,w_{i}),(t_{Z},z),(z,w_{i})\}\\ 0&\textrm{otherwise}\end{cases}.

It is straightforward to check that the multi-commodity flow {fi:1ik}\{\vec{f}_{i}:1\leq i\leq k\} is valid and satisfies {diwi:1ik}\{\vec{d}_{i}\circ w_{i}:1\leq i\leq k\}. Hence, λ(G;d1,,dk)1\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k})\geq 1.

Fact 2: If WW contains no perfect matching, λ=λ(G;d1,,dk)12\lambda^{*}=\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k})\leq\frac{1}{2}.

Let v1,,vkVv_{1},\cdots,v_{k}\in V be such that λ(G;d1v1,,dkvk)=λ\lambda(G;\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{k}\circ v_{k})=\lambda^{*}. One immediately sees that viWv_{i}\in W for any 1ik1\leq i\leq k, unless λ=0\lambda^{*}=0. Without loss of generality, assume that vi=wiv_{i}=w_{i} for any 1ik1\leq i\leq k.

Let {fi:1ik}\{\vec{f}_{i}:1\leq i\leq k\} be a valid multi-commodity flow that satisfies {λdiwi:1ik}\{\lambda^{*}\vec{d}_{i}\circ w_{i}:1\leq i\leq k\}.

Since {wi:1ik}\{w_{i}:1\leq i\leq k\} is not a perfect matching, there must be vXYZv\in X\bigcup Y\bigcup Z such that |{i:1ik,vwi}|2|\{i:1\leq i\leq k,v\in w_{i}\}|\geq 2. Again without loss of generality, assume that v=x1Xv=x_{1}\in X and vw1w2v\in w_{1}\bigcap w_{2}. For any i{1,2}i\in\{1,2\}, one can observe that fi(tX,xj)=0\vec{f}_{i}(t_{X},x_{j})=0 for any 2jk2\leq j\leq k, because a flow on such an arc can not reach w1w_{1} or w2w_{2}.

Then by Lemma 3, f1(a)=f2(a)=λ\vec{f}_{1}(a)=\vec{f}_{2}(a)=\lambda^{*} where a=(tX,x1)a=(t_{X},x_{1}). Considering that 1=c(a)f1(a)+f2(a)1=\vec{c}(a)\geq\vec{f}_{1}(a)+\vec{f}_{2}(a), we have λ12\lambda^{*}\leq\frac{1}{2}. ∎

To investigate the borderline of the intractability of Di-LoMuF, one might impose restrictions on instances to make them simple. One dimension of simplification is to upper bound the source number of the supply vectors. When every supply vector has only one source, the sources altogether form a trivial optimum solution to Di-LoMuF. Hence it is reasonable to focus on bi-source supply vectors, namely those each having at most two sources. Another dimension of simplification is to focus on simple graphs, so directed trees (called di-trees) are natural candidates. A di-tree is a directed graph which, after removing the directions of the arcs and neglecting multi-edges, becomes an undirected tree. A di-path can be defined likewise. To make our result as strong as possible, we further require that the di-trees are symmetric. A capacitated directed graph is called symmetric, if (1) all arcs have equal capacity, and (2) once having an arc (u,v)(u,v), it also has the twin arc (v,u)(v,u). We will show that Di-LoMuF remains hard even on these nearly trivial instances.

Before continuing, recall the 3-partition problem, which is well-known to be strongly NP-hard [8, page 99]. An instance of the 3-partition problem is a multi-set SS of positive integers with |S|=3m|S|=3m for some integer mm. The objective is to decide whether SS has an equi-partition, namely a partition S1,,SmS_{1},\cdots,S_{m} of SS such that sSis=sSjs\sum_{s\in S_{i}}s=\sum_{s\in S_{j}}s for any 1i,jm1\leq i,j\leq m.

Theorem 11.

Di-LoMuF is NP-hard on symmetric di-paths and bi-source supply vectors

Proof.

We prove the theorem via a reduction from 3-partition problems to Di-LoMuF. For this end, given an instance S={s1,,s3m}S=\{s_{1},\cdots,s_{3m}\} of 3-partition problem, we set about to construct a symmetric di-path and (5m2)(5m-2) bi-source supply vectors.

Refer to caption
Figure 5: The symmetric di-path for reducing 3-partition problem.

Specifically, as illustrated in Figure 5, the di-path G=(V,A,c)G=(V,A,\vec{c}) consists of mm vertices v1,,vmv_{1},\cdots,v_{m} and arcs ai=(vi,vi+1)a_{i}=(v_{i},v_{i+1}) and ai+1=(vi+1,vi)a^{\prime}_{i+1}=(v_{i+1},v_{i}) for any 1i<m1\leq i<m. Each arc has capacity mBmB, where B=sSsmB=\frac{\sum_{s\in S}s}{m}. For any 1i3m1\leq i\leq 3m and 1j<m1\leq j<m, define supply vectors di,dj,dj′′\vec{d}_{i},\vec{d}^{\prime}_{j},\vec{d}^{\prime\prime}_{j} such that for any vVv\in V,

di(v)={siif v{v1,vm}0otherwise\vec{d}_{i}(v)=\begin{cases}-s_{i}&\textrm{if }v\in\{v_{1},v_{m}\}\\ 0&\textrm{otherwise}\end{cases}
dj(v)={(mB+1)if v=vj(mj)Bif v=vj+10otherwise,\vec{d}^{\prime}_{j}(v)=\begin{cases}-(mB+1)&\textrm{if }v=v_{j}\\ -(m-j)B&\textrm{if }v=v_{j+1}\\ 0&\textrm{otherwise}\end{cases},
dj′′(v)={jBif v=vj(mB+1)if v=vj+10otherwise.\vec{d}^{\prime\prime}_{j}(v)=\begin{cases}-jB&\textrm{if }v=v_{j}\\ -(mB+1)&\textrm{if }v=v_{j+1}\\ 0&\textrm{otherwise}\end{cases}.

For notational simplicity, we sometimes use d3m+1,,d5m2\vec{d}_{3m+1},\cdots,\vec{d}_{5m-2} to stand for d1,,dm1,d1′′,,dm1′′\vec{d}^{\prime}_{1},\cdots,\vec{d}^{\prime}_{m-1},\vec{d}^{\prime\prime}_{1},\cdots,\vec{d}^{\prime\prime}_{m-1}, respectively.

Our proof will be done in two steps.

Step 1. If SS has an equi-partition, then λ(G;d1,,d5m2)1\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{5m-2})\geq 1.

Let S1,,SmS_{1},\cdots,S_{m} be an equi-partition of SS. For any 1i3m1\leq i\leq 3m, let 1jm1\leq j\leq m satisfy siSjs_{i}\in S_{j}, and we define flow fi\vec{f}_{i} such that for any aAa\in A,

fi(a)={siif a{ak:1k<j}{ak:j<km}0otherwise.\vec{f}_{i}(a)=\begin{cases}s_{i}&\textrm{if }a\in\{a_{k}:1\leq k<j\}\bigcup\{a^{\prime}_{k}:j<k\leq m\}\\ 0&\textrm{otherwise}\end{cases}.

One can check that fi\vec{f}_{i} satisfies demand vector divj\vec{d}_{i}\circ v_{j}.

For any 1im11\leq i\leq m-1, define flows fi,fi′′\vec{f}^{\prime}_{i},\vec{f}^{\prime\prime}_{i} such that for any aAa\in A,

fi(a)={(mi)Bif a=ai+10otherwise,\vec{f}^{\prime}_{i}(a)=\begin{cases}(m-i)B&\textrm{if }a=a^{\prime}_{i+1}\\ 0&\textrm{otherwise}\end{cases},
fi′′(a)={iBif a=ai0otherwise.\vec{f}^{\prime\prime}_{i}(a)=\begin{cases}iB&\textrm{if }a=a_{i}\\ 0&\textrm{otherwise}\end{cases}.

Obviously, fi\vec{f}^{\prime}_{i} satisfies divi\vec{d}^{\prime}_{i}\circ v_{i}, and fi′′\vec{f}^{\prime\prime}_{i} satisfies di′′vi+1\vec{d}^{\prime\prime}_{i}\circ v_{i+1}.

It is straightforward to check that all these flows form a valid multi-commodity flow. Altogether, we have λ(G;d1,,d5m2)1\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{5m-2})\geq 1.

Step 2. If λ(G;d1,,d5m2)1\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{5m-2})\geq 1, SS has an equi-partition.

Suppose ui,uj,uj′′V,1i3m,1jm1,u_{i},u^{\prime}_{j},u^{\prime\prime}_{j}\in V,1\leq i\leq 3m,1\leq j\leq m-1, are such that there is a valid multi-commodity flow fi,fj,fj′′,1i3m,1jm1,\vec{f}_{i},\vec{f}^{\prime}_{j},\vec{f}^{\prime\prime}_{j},1\leq i\leq 3m,1\leq j\leq m-1, satisfying demand vectors diui,djuj,dj′′uj′′,1i3m,1jm1\vec{d}_{i}\circ u_{i},\vec{d}^{\prime}_{j}\circ u^{\prime}_{j},\vec{d}^{\prime\prime}_{j}\circ u^{\prime\prime}_{j},1\leq i\leq 3m,1\leq j\leq m-1.

For any 1im1\leq i\leq m, let Vi={v1,vi}V_{i}=\{v_{1}\cdots,v_{i}\}. We proceed in two substeps.

Step 2.1. For any 1i<m1\leq i<m, ui=viu^{\prime}_{i}=v_{i} and ui′′=vi+1u^{\prime\prime}_{i}=v_{i+1}.

Arbitrary fix 1i<m1\leq i<m. For contradiction, assume that uiViu^{\prime}_{i}\notin V_{i}. Applying Lemma 3 to fi,diui,Vi\vec{f}^{\prime}_{i},\vec{d}^{\prime}_{i}\circ u^{\prime}_{i},V_{i}, one get fi(ai)mB+1\vec{f}^{\prime}_{i}(a_{i})\geq mB+1, contradictory to the fact that c(ai)=mB\vec{c}(a_{i})=mB. Hence, uiViu^{\prime}_{i}\in V_{i}. Likewise, one can further show that uiVi1u^{\prime}_{i}\notin V_{i-1}. As a result, ui=viu^{\prime}_{i}=v_{i}.

In a similar way, we also have ui′′=vi+1u^{\prime\prime}_{i}=v_{i+1}.

Step 2.2. SS has an equi-partition.

Arbitrarily fix 1i<m1\leq i<m. Applying Lemma 3 to fi,diui,Vi\vec{f}^{\prime}_{i},\vec{d}^{\prime}_{i}\circ u^{\prime}_{i},V_{i} and to fi′′,di′′ui′′,Vi\vec{f}^{\prime\prime}_{i},\vec{d}^{\prime\prime}_{i}\circ u^{\prime\prime}_{i},V_{i} respectively, one gets

fi(ai+1)(mi)B,fi′′(ai)iB.\displaystyle\vec{f}^{\prime}_{i}(a^{\prime}_{i+1})\geq(m-i)B,\vec{f}^{\prime\prime}_{i}(a_{i})\geq iB. (2)

Let Ji={j:1j3m,ujVi}J_{i}=\{j:1\leq j\leq 3m,u_{j}\in V_{i}\}. For any jJij\in J_{i}, apply Lemma 3 to fj,djuj,Vi\vec{f}_{j},\vec{d}_{j}\circ u_{j},V_{i}, and we have

fj(ai+1)sj.\displaystyle\vec{f}_{j}(a^{\prime}_{i+1})\geq s_{j}. (3)

Likewise, for any jJij\notin J_{i}, applying Lemma 3 to fj,djuj,Vi\vec{f}_{j},\vec{d}_{j}\circ u_{j},V_{i} results in

fj(ai)sj.\displaystyle\vec{f}_{j}(a_{i})\geq s_{j}. (4)

Then,

2mB=1j3msj+iB+(mi)B=jJisj+jJisj+iB+(mi)BjJifj(ai+1)+jJifj(ai)+fi′′(ai)+fi(ai+1) by (2)-(4)c(ai)+c(ai+1)=2mB by capacity constraints\begin{split}\begin{array}[]{rll}2mB&=\sum_{1\leq j\leq 3m}s_{j}+iB+(m-i)B&\\ &=\sum_{j\in J_{i}}s_{j}+\sum_{j\notin J_{i}}s_{j}+iB+(m-i)B&\\ &\leq\sum_{j\in J_{i}}\vec{f}_{j}(a^{\prime}_{i+1})+\sum_{j\notin J_{i}}\vec{f}_{j}(a_{i})&\\ &\quad+\vec{f}^{\prime\prime}_{i}(a_{i})+\vec{f}^{\prime}_{i}(a^{\prime}_{i+1})&\qquad\textrm{ by (\ref{equa:occupycapa1})-(\ref{equa:occupycapa3})}\\ &\leq\vec{c}(a_{i})+\vec{c}(a^{\prime}_{i+1})=2mB&\qquad\textrm{ by capacity constraints}\end{array}\end{split} (5)

As a result, all the inequalities in (2)-(5) are actually equalities. Hence,

jJisj=jJifj(ai+1)=mBfi(ai+1)=iB.\sum_{j\in J_{i}}s_{j}=\sum_{j\in J_{i}}\vec{f}_{j}(a^{\prime}_{i+1})=mB-\vec{f}^{\prime}_{i}(a^{\prime}_{i+1})=iB.

Let J0=J_{0}=\emptyset and Jm={j:1j3m}J_{m}=\{j:1\leq j\leq 3m\}. For any 1im1\leq i\leq m, define Si={sj:jJiJi1}S_{i}=\{s_{j}:j\in J_{i}\setminus J_{i-1}\}, which satisfies sSis=jJiJi1sj=jJisjjJi1sj=B\sum_{s\in S_{i}}s=\sum_{j\in J_{i}\setminus J_{i-1}}s_{j}=\sum_{j\in J_{i}}s_{j}-\sum_{j\in J_{i-1}}s_{j}=B. This means that S1,,SmS_{1},\cdots,S_{m} is an equi-partition of SS.∎

Remark 4.

Recall Corollary 8 which implies the tractability of LoMuF on bi-source supply vectors. It is in sharp contrast to the intractability of Di-LoMuF in this situation. Furthermore, Theorem 5 claims that LoMuF is polynomial-time solvable when the input graph is a tree, but Di-LoMuF remains NP-hard even on symmetric di-paths. These serves as an evidence that LoMuF is generally harder than LoMuF.

We have seen the hardness of Di-LoMuF even in the nearly-trivial cases. Fortunately, the next theorem will relieve us from frustration, because it indicates the possibility to approximately solve Di-LoMuF. A new definition is needed.

Given a capacitated directed graph G=(V,A,c)G=(V,A,\vec{c}), for any u,vVu,v\in V, let A{u,v}={(u,v),(v,u)}AA_{\{u,v\}}=\{(u,v),(v,u)\}\bigcap A. Define the induced graph of GG to be the capacitated undirected graph G=(V,E,c)G^{\prime}=(V,E,\vec{c}^{\prime}), where E={u,v:u,vV,A{u,v}}E=\{\langle u,v\rangle:u,v\in V,A_{\{u,v\}}\neq\emptyset\}, and for any e=u,vEe=\langle u,v\rangle\in E, c(e)=aA{u,v}c(a)\vec{c}^{\prime}(e)=\sum_{a\in A_{\{u,v\}}}\vec{c}(a). Intuitively, GG^{\prime} is obtained from GG by neglecting the direction of the arcs and merging the capacities of twin arcs if any.

Theorem 12.

Di-LoMuF has a polynomial-time 2-approximation algorithm on symmetric di-trees.

Proof.

Arbitrarily fix a symmetric di-tree G=(V,A,c)G=(V,A,\vec{c}) and supply vectors d1,,dkV\vec{d}_{1},\cdots,\vec{d}_{k}\in\mathbb{R}_{-}^{V}. Let G=(V,E,c)G^{\prime}=(V,E,\vec{c}^{\prime}) be the induced graph of GG and arbitrarily orient the edges. Suppose v1,,vkv_{1},\cdots,v_{k} be the output of Algorithm 1 when the input is (G;d1,,dk)(G^{\prime};\vec{d}_{1},\cdots,\vec{d}_{k}). We set about to prove that v1,,vkv_{1},\cdots,v_{k} is a 2-approximate solution to Di-LoMuF on the instance (G;d1,,dk)(G;\vec{d}_{1},\cdots,\vec{d}_{k}).

Let λ=λ(G;d1,,dk)\lambda^{*}=\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k}) and λ=λ(G;d1,,dk)\lambda^{\prime*}=\lambda(G^{\prime};\vec{d}_{1},\cdots,\vec{d}_{k}). Our task is reduced to proving two claims.

Claim 1. λ(G;d1v1,,dkvk)λ2\lambda(G;\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{k}\circ v_{k})\geq\frac{\lambda^{\prime*}}{2}.

Let f1,,fkE\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k}\in\mathbb{R}^{E} be a valid multi-commodity flow satisfying λd1v1,,λdkvk\lambda^{\prime*}\vec{d}_{1}\circ v_{1},\cdots,\lambda^{\prime*}\vec{d}_{k}\circ v_{k}, where λ=λ(G;d1,,dk)\lambda^{\prime*}=\lambda(G^{\prime};\vec{d}_{1},\cdots,\vec{d}_{k}).

For any 1ik1\leq i\leq k, define flow fi+A\vec{f}_{i}\in\mathbb{R}_{+}^{A} such that for any arc (u,v)A(u,v)\in A,

fi(u,v)={|fi(e)|2if either the orientation of e=u,v is from u to v and fi(e)>0or the orientation is from v to u and fi(e)<00otherwise.\vec{f}_{i}(u,v)=\begin{cases}\frac{|\vec{f}^{\prime}_{i}(e)|}{2}&\begin{array}[]{l}\textrm{if either the orientation of }e=\langle u,v\rangle\textrm{ is from }u\textrm{ to }v\textrm{ and }\vec{f}^{\prime}_{i}(e)>0\\ \textrm{or the orientation is from }v\textrm{ to }u\textrm{ and }\vec{f}^{\prime}_{i}(e)<0\end{array}\\ 0&\textrm{otherwise}\end{cases}.

It is straightforward to check that f1,,fk\vec{f}_{1},\cdots,\vec{f}_{k} is a valid multi-commodity flow on GG that satisfies λ2d1v1,,λ2dkvk\frac{\lambda^{\prime*}}{2}\vec{d}_{1}\circ v_{1},\cdots,\frac{\lambda^{\prime*}}{2}\vec{d}_{k}\circ v_{k}. Hence, Claim 1 holds.

Claim 2. λλ\lambda^{\prime*}\geq\lambda^{*}.

Let u1,,ukVu_{1},\cdots,u_{k}\in V be such that there is a valid multi-commodity flow f1,,fk+A\vec{f}_{1},\cdots,\vec{f}_{k}\in\mathbb{R}_{+}^{A} on GG which satisfies λd1u1,,λdkuk\lambda^{*}\vec{d}_{1}\circ u_{1},\cdots,\lambda^{*}\vec{d}_{k}\circ u_{k}. For any 1ik1\leq i\leq k, define flow fiE\vec{f}^{\prime}_{i}\mathbb{R}^{E} on GG^{\prime} as follows: for any edge e=u,vEe=\langle u,v\rangle\in E, if it is oriented from uu to vv in GG^{\prime}, set fi(e)=fi(u,v)fi(v,u)\vec{f}^{\prime}_{i}(e)=\vec{f}_{i}(u,v)-\vec{f}_{i}(v,u). Roughly speaking, each fi\vec{f}^{\prime}_{i} is obtained from fi\vec{f}_{i} by merging traffics on twin arcs.

Again, it is easy to check that f1,,fk\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k} is a valid multi-commodity flow on GG^{\prime} that satisfies λd1u1,,λdkuk\lambda^{*}\vec{d}_{1}\circ u_{1},\cdots,\lambda^{*}\vec{d}_{k}\circ u_{k}. This immediately leads to Claim 2.

Combining Claims 1 and 2, we have λ(G;d1v1,,dkvk)λ2\lambda(G;\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{k}\circ v_{k})\geq\frac{\lambda^{*}}{2}, which means that v1,,vkv_{1},\cdots,v_{k} is a 2-approximate solution to Di-LoMuF on the instance (G;d1,,dk)(G;\vec{d}_{1},\cdots,\vec{d}_{k}). ∎

Theorem 12 can be extended to general symmetric directed graphs. Recall the concept concentration defined in Remark 2.

Corollary 13.

Di-LoMuF has a polynomial-time 2max{η1,1}2\cdot\max\{\eta-1,1\}-approximation algorithm on symmetric directed graphs, where η\eta the concentration of the supply vectors.

Proof.

We follow the proof of Theorem 12. The only difference is that Algorithm 2 rather than Algorithm 1 is invoked. This modification is necessary, since Algorithm 1 is unfit for general undirected graphs.

The detailed proof is omitted. ∎

5 Other Variants of LoMuF

We continue to handle other variants of LoMuF, which are defined by extending LoMuF in three dimensions:

  1. 1.

    Different network models. In Section 4, we have thoroughly studied directed and undirected graphs. This section will consider the unsplittable flow model, which means that any flow from a source to a target is along one path. Such a flow model has been actively studied in the literature [17].

  2. 2.

    Different solution constraints. We restrict the targets to be chosen from a candidate set, rather than from the entire vertex set. This properly models the practical situation where applications can be deployed to prescribed servers. Such a restricted version of LoMuF is called restricted-LoMuF.

  3. 3.

    Different optimization goals. The network flow community typically serves three optimization goals: concurrent flow value which proportionately maximizes the flows, total flow value which maximizes the summation of all flows, and feasibility which maximize the number of feasible flows. Since concurrent flow value has been elaborated on in the previous sections, this section will investigate the latter two.

Now we begin to present some results of the variants.

Unsplittable flow: A flow is unsplittable if it can be decomposed into flow paths each of which corresponds to the flow from one source to the target and the correspondence is one-to-one. By a flow path, we mean a flow which has non-zero congestion only along a path, and we say that a flow path passes an edge if the flow has non-zero congestion on the edge.

Since on trees there is a unique path connecting any two vertices, flows on trees are intrinsically unsplittable. Consequently, by Theorem 5, even under the unsplittable flow model, LoMuF on trees is polynomial-time solvable. Actually, all the results in the previous sections remain true under the unsplittable flow model, since all the flows in the proofs are unsplittable. Moreover, stronger results can be obtained. See the following theorem as an example.

Theorem 14.

Under the unsplittable flow model, LoMuF is NP-hard and cannot be approximated within ratio 2 in polynomial time.

Proof.

Roughly speaking, we reduce 3-DM to LoMuF, and show that the solutions to LoMuF has a big gap of unsplittable flows indicating whether or not a perfect matching exists.

Basically, we follow the proof of Theorem 4. Given an instance (X,Y,Z,W)(X,Y,Z,W) of 3-DM with |X|=k|X|=k and |W|=l|W|=l, let G=(V,E,c)G=(V,E,\vec{c}) be the capacitated undirected graph as constructed in the proof of Theorem 4 (illustrated in Figure 2). We also adopt supply vectors di,1il\vec{d}_{i},1\leq i\leq l as in the proof of Theorem 4. For ease of reading, the vectors are redefined here. For any 1ik,k+1jl,vV1\leq i\leq k,k+1\leq j\leq l,v\in V,

di(v)={1if v{tX,tY,tZ}0otherwise\vec{d}_{i}(v)=\begin{cases}-1&\textrm{if }v\in\{t_{X},t_{Y},t_{Z}\}\\ 0&\textrm{otherwise}\end{cases}
dj(v)={1if v{tX,tY,tZ}0otherwise\vec{d}_{j}(v)=\begin{cases}-1&\textrm{if }v\in\{t^{\prime}_{X},t^{\prime}_{Y},t^{\prime}_{Z}\}\\ 0&\textrm{otherwise}\end{cases}

The rest of the proof consists of two parts.

Part 1: a perfect matching in WW implies λuf(G;d1,,dk)1\lambda_{uf}(G;\vec{d}_{1},\cdots,\vec{d}_{k})\geq 1, where the subscript ufuf indicates that the objective value is under the unsplittable flow model.

The proof is identical to the counterpart of the proof of Theorem 4, so omitted here.

Part 2: If WW contains no perfect matching, λ=λuf(G;d1,,dk)12\lambda^{*}=\lambda_{uf}(G;\vec{d}_{1},\cdots,\vec{d}_{k})\leq\frac{1}{2}.

Suppose the optimum targets are v1,,vlv_{1},\cdots,v_{l}, and the multi-commodity flow f1,,fl\vec{f}_{1},\cdots,\vec{f}_{l} is valid and satisfies d1v1,,dlvl\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{l}\circ v_{l}. Under the unsplittable flow model, for any 1il1\leq i\leq l, fi=fi,X+fi,Y+fi,Z\vec{f}_{i}=\vec{f}_{i,X}+\vec{f}_{i,Y}+\vec{f}_{i,Z} where fi,X\vec{f}_{i,X}, called a summand path of fi\vec{f}_{i}, is a flow path from tXt_{X} (or tXt^{\prime}_{X} when l>kl>k) to viv_{i}, and likewise for fi,Y,fi,Z\vec{f}_{i,Y},\vec{f}_{i,Z}.

Let EWE_{W} be the set of edges incident to vertices in WW. For 1il1\leq i\leq l, it is easy to observe two facts:

Fact 1

: If viWv_{i}\in W, each summand path of fi\vec{f}_{i} has non-zero congestion on at least one edge in EWE_{W}.

Fact 2

: If viWv_{i}\notin W, there are two summand paths of fi\vec{f}_{i} each having non-zero congestions on at least two edges in EWE_{W}.

Now we proceed case by case.

Case 1: viWv_{i}\notin W for some 1il1\leq i\leq l. By Facts 1 and 2, considering that there are 3l3l edges in EWE_{W} and 3l3l summand paths of all the flows, there must be an edge eEWe\in E_{W} shared by at least two flow paths. Since a flow path has congestion λ\lambda^{*} on any edge along it and each edge has capacity 1, we see that λ12\lambda^{*}\leq\frac{1}{2}.

Case 2: vi=vjWv_{i}=v_{j}\in W for some 1ijl1\leq i\neq j\leq l. fi\vec{f}_{i} and fj\vec{f}_{j} altogetger have six summand paths, each of which arrives viv_{i}. However, viv_{i} has only three incident edges, so at least two of the summand paths share an incident edge of viv_{i}. Again, since a flow path has congestion λ\lambda^{*} on any edge along it and each edge has capacity 1, we have λ12\lambda^{*}\leq\frac{1}{2}.

Case 3: viv_{i}’s lie in WW and are pairwise different. Assume without loss of generality that vi=wiv_{i}=w_{i}, for any 1il1\leq i\leq l. Because WW contains no perfect matching, there exist 1i,jk1\leq i,j\leq k such that wiwjw_{i}\bigcap w_{j}\neq\emptyset. Again without loss of generality, assume xXwiwjx\in X\bigcap w_{i}\bigcap w_{j}.

If fi,X\vec{f}_{i,X} does not pass the edge tX,x\langle t_{X},x\rangle, it must pass more than one edge in EWE_{W} before reaching wiw_{i}. Following the argument of Case 1, we see that λ12\lambda^{*}\leq\frac{1}{2}. Likewise, we have λ12\lambda^{*}\leq\frac{1}{2} if fj,X\vec{f}_{j,X} does not pass the edge tX,x\langle t_{X},x\rangle.

What’s remaining is when both fi,X\vec{f}_{i,X} and fj,X\vec{f}_{j,X} pass the edge tX,x\langle t_{X},x\rangle. One gets λ12\lambda^{*}\leq\frac{1}{2} due to the capacity constraint on this edge. ∎

Then we show that restricting targets (i.e., targets can be chosen only in a candidate set of vertices) substantially affects the hardness of target location problems. Since the unrestricted version is a special case of the restricted one, all the hardness results (including the lower bounds of the approximation ratios) remain valid. In fact, restricting targets may make the problems harder, which is confirmed below. Recall Theorem 5 which claims that LoMuF on trees is polynomial-time solvable. Nevertheless, with restricted targets, LoMuF on trees even has no PTAS.

Before going on, let’s recall a property of 3-DM. Let (X,Y,Z,W)(X,Y,Z,W) be an instance of 3-DM. For any uXYZu\in X\bigcup Y\bigcup Z, define its covering set to be ξ(u)={wW:uw}\xi(u)=\{w\in W:u\in w\}. It is known that (X,Y,Z,W)(X,Y,Z,W) remains NP-complete even on 3-covered instances, namely, maxuXYZ|ξ(u)|3\max_{u\in X\bigcup Y\bigcup Z}|\xi(u)|\leq 3 [8, page 221].

Theorem 15.

LoMuF with restricted targets is NP-hard on trees and cannot be approximated within ratio 76\frac{7}{6} in polynomial-time.

Proof.

We prove the theorem by reducing 3-DM to LoMuF.

Arbitrarily fix a 3-covered instance (X,Y,Z,W)(X,Y,Z,W) of 3-DM. Let k=|X|k=|X| and l=|W|l=|W| with W={w1,,wl}W=\{w_{1},\cdots,w_{l}\}. We will construct an instance of LoMuF with restricted targets, including a capacitated undirected graph G=(V,E,c)G=(V,E,\vec{c}), l+2kl+2k supply vectors, and a candidate set of vertices in which the targets of the supply vectors can be located.

Specifically, as illustrated in Figure 6, GG is a tree consisting of a root rr and the set WW of leaves, and the capacity of each edge is 6. All the edges are oriented from leaves to the root.

Refer to caption
Figure 6: The tree for reducing 3-DM.

Let U=XYZU=X\bigcup Y\bigcup Z. For any uUu\in U, define a supply vector du\vec{d}_{u} such that for any vVv\in V,

du(v)={1if uvW|ξ(u)|3if v=r0otherwise.\vec{d}_{u}(v)=\begin{cases}-1&\textrm{if }u\in v\in W\\ |\xi(u)|-3&\textrm{if }v=r\\ 0&\textrm{otherwise}\end{cases}.

For any 1ilk1\leq i\leq l-k, define a supply vector di\vec{d}_{i} such that for any vVv\in V,

di(v)={3if v=r0otherwise.\vec{d}_{i}(v)=\begin{cases}-3&\textrm{if }v=r\\ 0&\textrm{otherwise}\end{cases}.

Let the candidate set be WW, i.e., we are not allowed to choose the root as targets.

The rest of the proof consists of two parts.

Part 1: If WW contains a perfect matching, λW(G;d1,,dl+2k)1\lambda_{W}(G;\vec{d}_{1},\cdots,\vec{d}_{l+2k})\geq 1, where the subscript WW indicates that the candidate set for the targets is WW.

Without loss of generality, suppose WWW^{\prime}\subseteq W is a perfect matching. Let ϕ:U{1,,l}\phi:U\rightarrow\{1,\cdots,l\} be the mapping such that uwϕ(u)Wu\in w_{\phi(u)}\in W^{\prime} for any uUu\in U. For any uUu\in U, define flow fu\vec{f}_{u} such that for any edge e=r,we=\langle r,w\rangle,

fu(e)={2if w=wϕ(u)1if wξ(u){wϕ(u)}0otherwise.\vec{f}_{u}(e)=\begin{cases}-2&\textrm{if }w=w_{\phi(u)}\\ 1&\textrm{if }w\in\xi(u)\setminus\{w_{\phi(u)}\}\\ 0&\textrm{otherwise}\end{cases}.

One can check that fu\vec{f}_{u} satisfies the demand vector duwϕ(u)\vec{d}_{u}\circ w_{\phi(u)}.

Then arbitrarily fix a bijective mapping ψ:{1,,lk}{1,,l}ϕ(U)\psi:\{1,\cdots,l-k\}\rightarrow\{1,\cdots,l\}\setminus\phi(U). For any 1ilk1\leq i\leq l-k, define flow fi\vec{f}_{i} such that for any edge e=r,we=\langle r,w\rangle,

fi(e)={3if w=wψ(i)0otherwise.\vec{f}_{i}(e)=\begin{cases}-3&\textrm{if }w=w_{\psi(i)}\\ 0&\textrm{otherwise}\end{cases}.

One can check that fi\vec{f}_{i} satisfies the demand vector diwψ(i)\vec{d}_{i}\circ w_{\psi(i)}.

Furthermore, it is easy to see that the l+2kl+2k flows form a valid multi-commodity flow. Hence we finishes the proof of Part 1.

Part 2: If WW contains no perfect matching, λ=λW(G;d1,,dl+2k)67\lambda^{*}=\lambda_{W}(G;\vec{d}_{1},\cdots,\vec{d}_{l+2k})\leq\frac{6}{7}.

Let vuWv_{u}\in W for uUu\in U, viWv_{i}\in W for 1imn1\leq i\leq m-n be such that there is a valid multi-commodity flow fu\vec{f}_{u} for uUu\in U, fi\vec{f}_{i} for 1imn1\leq i\leq m-n satisfying λduvu\lambda^{*}\vec{d}_{u}\circ v_{u} for uUu\in U, λdivi\lambda^{*}\vec{d}_{i}\circ v_{i} for 1imn1\leq i\leq m-n.

First of all, for any edge e=r,we=\langle r,w\rangle, we can observe two facts:

uU|fu(e)|(3+n+3n)λ\displaystyle\sum_{u\in U}|\vec{f}_{u}(e)|\geq(3+n+3n^{\prime})\lambda^{*} (6)

where n=|{u:uw,vu=w}|n=|\{u:u\in w,v_{u}=w\}| and n=|{u:uw,vu=w}|n^{\prime}=|\{u:u\notin w,v_{u}=w\}|, and

1ilk|fi(e)|3mλ\displaystyle\sum_{1\leq i\leq l-k}|\vec{f}_{i}(e)|\geq 3m\lambda^{*} (7)

where m=|{i:1ilk,vi=w}|m=|\{i:1\leq i\leq l-k,v_{i}=w\}|. The detailed proof is omitted since the inequalities are immediate results of applying Lemma 3 to Cut({w})Cut(\{w\}).

Then we proceed case by case.

Case 1: vi=vuv_{i}=v_{u} for some 1imn,uU1\leq i\leq m-n,u\in U.

Let e=r,vie=\langle r,v_{i}\rangle. By (6) and (7), the total congestion on edge ee satisfies uU|fu(e)|+1ilk|fi(e)|7λ\sum_{u\in U}|\vec{f}_{u}(e)|+\sum_{1\leq i\leq l-k}|\vec{f}_{i}(e)|\geq 7\lambda^{*}. By capacity constraint on ee, we have λ67\lambda^{*}\leq\frac{6}{7}.

Case 2: vi=vjv_{i}=v_{j} for some 1ijlk1\leq i\neq j\leq l-k.

Let e=r,vie=\langle r,v_{i}\rangle. By (6) and (7), the total congestion on edge ee satisfies uU|fu(e)|+1ilk|fi(e)|9λ\sum_{u\in U}|\vec{f}_{u}(e)|+\sum_{1\leq i\leq l-k}|\vec{f}_{i}(e)|\geq 9\lambda^{*}. By capacity constraint on ee, we have λ23\lambda^{*}\leq\frac{2}{3}.

Case 3: there exists wWw\in W such that |{uU:vu=w}|4|\{u\in U:v_{u}=w\}|\geq 4.

Let e=r,we=\langle r,w\rangle. By (6), uU|fu(e)||7λ\sum_{u\in U}|\vec{f}_{u}(e)||\geq 7\lambda^{*}. By capacity constraint on ee, we have λ67\lambda^{*}\leq\frac{6}{7}.

The rest of the proof will assume that none of the Cases 1-3 happens. Let W={wW:vu=w for some uU}W^{\prime}=\{w\in W:v_{u}=w\text{ for some }u\in U\} and W′′={wW:vi=w for some 1ijlk}W^{\prime\prime}=\{w\in W:v_{i}=w\text{ for some }1\leq i\neq j\leq l-k\}. We have

WW′′=,|W′′|=lk,|W|k.\displaystyle W^{\prime}\bigcap W^{\prime\prime}=\emptyset,|W^{\prime\prime}|=l-k,|W^{\prime}|\leq k. (8)

By the pigeon hole principle, one further sees that for any wWw\in W^{\prime}, |{uU:vu=w}|=3|\{u\in U:v_{u}=w\}|=3.

Case 4: there exists uUu\in U such that uvuu\notin v_{u}.

Let e=r,vue=\langle r,v_{u}\rangle. Since |{uU:vu=vu}|=3|\{u^{\prime}\in U:v_{u^{\prime}}=v_{u}\}|=3 and uvuu\notin v_{u}, by (6), uU|fu(e)||8λ\sum_{u\in U}|\vec{f}_{u}(e)||\geq 8\lambda^{*}. By capacity constraint on ee, we have λ34\lambda^{*}\leq\frac{3}{4}.

Case 5: None of the above cases happens.

Since uvuu\in v_{u} for any uUu\in U, |U||wWw||U|\leq|\bigcup_{w\in W^{\prime}}w| which is at most 3k3k due to (8). Recall that |U|=3k|U|=3k, so |wWw|=3k|\bigcup_{w\in W^{\prime}}w|=3k. As a result, ww=w\bigcap w^{\prime}=\emptyset for any w,wWw,w^{\prime}\in W^{\prime}, which implies that WW^{\prime} is a perfect matching, contradictory to the assumption that WW contains no perfect matching. Therefore, Case 5 never happens.

To sum up all the cases, λ67\lambda^{*}\leq\frac{6}{7}. The proof ends. ∎

The following theorem is also a surprise. In the unrestricted case, if all the supply vectors are uni-source (i.e., each having a single source), a trivial optimum solution to LoMuF is choosing the sources themself as targets. However, when targets are restricted to a prescribed sets, LoMuF becomes NP-hard even on uni-source supply vectors and stars (i.e., trees of depth 11).

Theorem 16.

LoMuF with restricted targets is NP-hard on uni-source supply vectors and stars.

Proof.

We prove the theorem via a reduction from 3-partition problem to LoMuF. For this end, given an instance S={s1,,s3m}S=\{s_{1},\cdots,s_{3m}\} of 3-partition problem, we set about to construct an instance of LoMuF with restricted targets, including a capacitated star, 3m3m supply vectors, and a candidate set of targets.

Refer to caption
Figure 7: The star for reducing 3-partition problem.

Specifically, as illustrated in Figure 7, the capacitated undirected star G=(V,E,c)G=(V,E,\vec{c}) consists of the center rr and the set UU of mm leaves u1,,umu_{1},\cdots,u_{m}. Orient every edge to point to rr. Each edge has capacity BB, where B=sSsmB=\frac{\sum_{s\in S}s}{m}. For any 1i3m1\leq i\leq 3m, define supply vector diV\vec{d}_{i}\in\mathbb{R}_{-}^{V} such that di(r)=si\vec{d}_{i}(r)=-s_{i} and di(u)=0\vec{d}_{i}(u)=0 for any uUu\in U. Appoint UU to be the candidate set of targets.

Our proof will be done in two steps.

Step 1. If SS has an equi-partition, then λU(G;d1,,d3m)1\lambda_{U}(G;\vec{d}_{1},\cdots,\vec{d}_{3m})\geq 1.

Let {S1,,Sm}\{S_{1},\cdots,S_{m}\} be an equi-partition of SS. For any 1i3m1\leq i\leq 3m, let 1jm1\leq j\leq m satisfy siSjs_{i}\in S_{j}, and we define flow fi\vec{f}_{i} such that for any eEe\in E,

fi(e)={siif e=r,uj0otherwise.\vec{f}_{i}(e)=\begin{cases}-s_{i}&\textrm{if }e=\langle r,u_{j}\rangle\\ 0&\textrm{otherwise}\end{cases}.

One can check that fi\vec{f}_{i} satisfies demand vector diuj\vec{d}_{i}\circ u_{j}.

Since {S1,,Sm}\{S_{1},\cdots,S_{m}\} is an equi-partition of SS, all these flows form a valid multi-commodity flow. Hence, we have λU(G;d1,,d3m)1\lambda_{U}(G;\vec{d}_{1},\cdots,\vec{d}_{3m})\geq 1.

Step 2. If λU(G;d1,,d3m)1\lambda_{U}(G;\vec{d}_{1},\cdots,\vec{d}_{3m})\geq 1, SS has an equi-partition.

Let v1,,v3mUv_{1},\cdots,v_{3m}\in U be such that there is a valid multi-commodity flow {f1,,f3m}\{\vec{f}_{1},\cdots,\vec{f}_{3m}\} satisfying d1v1,,d3mv3m\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{3m}\circ v_{3m}. For any 1jm1\leq j\leq m, let Ij={1i3m:vi=uj}I_{j}=\{1\leq i\leq 3m:v_{i}=u_{j}\}. Applying Lemma 3, we have |fi(r,uj)|si|\vec{f}_{i}(\langle r,u_{j}\rangle)|\geq s_{i} for any iIji\in I_{j}. Due to the capacity constraint, one gets BiIj|fi(r,uj)|B\geq\sum_{i\in I_{j}}|\vec{f}_{i}(\langle r,u_{j}\rangle)|. As a result, mB1jmiIj|fi(r,uj)|sSs=mBmB\geq\sum_{1\leq j\leq m}\sum_{i\in I_{j}}|\vec{f}_{i}(\langle r,u_{j}\rangle)|\geq\sum_{s\in S}s=mB, meaning that B=iIj|fi(r,uj)|B=\sum_{i\in I_{j}}|\vec{f}_{i}(\langle r,u_{j}\rangle)| for any 1jm1\leq j\leq m. Hence, {Sj={si:iIj}:1jm}\{S_{j}=\{s_{i}:i\in I_{j}\}:1\leq j\leq m\} is an equi-partition of SS. ∎

Then we discuss the target location version of the maximum multi-commodity problem. Arbitrarily fix supply vectors diV,iI\vec{d}_{i}\in\mathbb{R}_{-}^{V},i\in I on a capacitated directed/undirected graph with vertex set VV, where II is a finite index set. Roughly speaking, we are to locate targets for the supply vectors so as to maximize the total flow values. In particular, we have to find vi,iIv_{i},i\in I to maximize iIλidi1\sum_{i\in I}\lambda_{i}\|\vec{d}_{i}\|_{1}, where non-negative reals λi\lambda_{i}’s are such that

  1. 1.

    For any iIi\in I, there exists a flow fi\vec{f}_{i} satisfying the demand vector λidivi\lambda_{i}\vec{d}_{i}\circ v_{i}, and

  2. 2.

    {fi:iI}\{\vec{f}_{i}:i\in I\} is a valid multi-commodity flow.

It is worth noting that all the preceding results in this paper still hold (and the proofs are also valid), except that we are not sure whether the lower bounds of approximation ratio remain true.

Finally, we investigate the target location version of the maximum feasibility problem (maxf-LoMuF for short). Intuitively, our goal is to locate the targets so as to maximize the number of satisfiable supply vectors. Formally, given a set SS of demand vectors on a capacitated network GG, its feasibility ζ(G;S)\zeta(G;S) is defined to be the maximum subset of SS that can be simultaneously satisfied, namely, ζ(G;S)=maxSS,λ(G;S)1|S|\zeta(G;S)=\max_{S^{\prime}\subseteq S,\lambda(G;S^{\prime})\geq 1}|S^{\prime}|. Given supply vectors diV,iI\vec{d}_{i}\in\mathbb{R}_{-}^{V},i\in I on a capacitated directed/undirected graph GG with vertex set VV, the task of maxf-LoMuF is to find v1V,iIv_{1}\in V,i\in I so as to maximize ζ(G;divi,iI)\zeta(G;\vec{d}_{i}\circ v_{i},i\in I). By abusing notation, the optimum objective value will also be denoted by ζ(G;di,iI)\zeta(G;\vec{d}_{i},i\in I).

We will show that maxf-LoMuF is hard to approximate. The proof relies on a reduction from the well-studied maximum independent set problem (MIS) which aims to find a maximum set of vertices that are pairwise non-adjacent in a given graph. Let’s first recall a property of MIS.

Lemma 17 ([11]).

For any constant ϵ>0\epsilon>0, unless NP=ZPP, MIS can not be approximated within O(n1ϵ)O(n^{1-\epsilon}) on graphs of nn vertices for any constant ϵ>0\epsilon>0.

Theorem 18.

For any constant ϵ>0\epsilon>0, unless NP=ZPP, the maxf-LoMuF problem on kk supply vectors cannot be approximated within O(k1ϵ)O(k^{1-\epsilon}) in polynomial-time.

Proof.

We prove by reducing MIS to maxf-LoMuF. Namely, given a graph G=(V,E)G=(V,E), we will construct a capacitated graph G=(V,E,c)G^{\prime}=(V^{\prime},E^{\prime},\vec{c}) and supply vectors d1,,dkV\vec{d}_{1},\cdots,\vec{d}_{k}\in\mathbb{R}_{-}^{V^{\prime}}, where k=|V|k=|V|.

Specifically, V=VEWV^{\prime}=V\bigcup E\bigcup W, where W={we:eE}W=\{w_{e}:e\in E\}. E=E1E2E^{\prime}=E^{\prime}_{1}\bigcup E^{\prime}_{2}, where E1={v,e:vV,eE,v is an end of e}E^{\prime}_{1}=\{\langle v,e\rangle:v\in V,e\in E,v\textrm{ is an end of }e\} and E2={e,we:eE}E^{\prime}_{2}=\{\langle e,w_{e}\rangle:e\in E\}. Every edge of GG^{\prime} has capacity 1. The graph GG^{\prime} is illustrated in Figure 8. We choose to orient every edge upward.

Refer to caption
Figure 8: The capacitated graph to which a MIS instance is reduced.

For any 1ik1\leq i\leq k, define a supply vector di\vec{d}_{i} such that for any vVv\in V^{\prime},

di(v)={1if v{vi}{we:e is incident to vi}0otherwise.\vec{d}_{i}(v)=\begin{cases}-1&\textrm{if }v\in\{v_{i}\}\bigcup\{w_{e}:e\textrm{ is incident to }v_{i}\}\\ 0&\textrm{otherwise}\end{cases}.

Arbitrarily fix a subset I{1,,k}I\subseteq\{1,\cdots,k\}. We prove two claims:

  • Claim 1: If {vi:iI}\{v_{i}:i\in I\} is an independent set of GG, then λ(G;di,iI)1\lambda(G^{\prime};\vec{d}_{i},i\in I)\geq 1.

    Suppose {vi:iI}\{v_{i}:i\in I\} is an independent set of GG. For any iIi\in I, define flow fi\vec{f}_{i} such that for any e=u,eEe^{\prime}=\langle u,e\rangle\in E^{\prime} with eEe\in E,

    fi(e)={1if u{vi,we},e is incident to vi in G0otherwise.\vec{f}_{i}(e^{\prime})=\begin{cases}1&\textrm{if }u\in\{v_{i},w_{e}\},e\textrm{ is incident to }v_{i}\textrm{ in }G\\ 0&\textrm{otherwise}\end{cases}.

    It is easy to check that {fi:iI}\{\vec{f}_{i}:i\in I\} is a valid multi-commodity flow satisfying {divi:iI}\{\vec{d}_{i}\circ v_{i}:i\in I\}. Hence, λ(G;di,iI)1\lambda(G^{\prime};\vec{d}_{i},i\in I)\geq 1.

  • Claim 2: If λ(G;di:iI)1\lambda(G^{\prime};\vec{d}_{i}:i\in I)\geq 1, then {vi:iI}\{v_{i}:i\in I\} is an independent set of GG.

    Assume λ(G;di:iI)1\lambda(G^{\prime};\vec{d}_{i}:i\in I)\geq 1. Choose uiV,iIu_{i}\in V^{\prime},i\in I such that there is a valid multi-commodity flow {fi:iI}\{\vec{f}_{i}:i\in I\} which satisfies {diui:iI}\{\vec{d}_{i}\circ u_{i}:i\in I\}.

    We set about to show that {v1,,vy}\{v_{1},\cdots,v_{y}\} is an independent set of GG. For contradiction, suppose iiIi\neq i^{\prime}\in I are such that viv_{i} and viv_{i}^{\prime} are both incident to eEe\in E in GG. By Lemma 3, no matter where uiu_{i} lies, we always have |fi(e,we)|1|\vec{f}_{i}(\langle e,w_{e}\rangle)|\geq 1. Likewise, we also have |fi(e,we)|1|\vec{f}_{i^{\prime}}(\langle e,w_{e}\rangle)|\geq 1. Considering that c(e,we)=1\vec{c}(\langle e,w_{e}\rangle)=1, we reach a contradiction. Claim 2 holds.

By Claims 1 and 2, for any α\alpha-approximate solution to the instance of maxf-LoMuF, we can construct an α\alpha-approximate solution to the instance of MIS, and vice versa. Then the theorem holds due to Lemma 17. ∎

We have a trivial approximation algorithm for maxf-LoMuF: Given a capacitated graph GG and kk supply vectors di,1ik\vec{d}_{i},1\leq i\leq k, by enumerating, find the first 1ik1\leq i\leq k and vVv\in V such that λ(G;div)1\lambda(G;\vec{d}_{i}\circ v)\geq 1. This algorithm obviously has approximation ratio kk, which is nearly optimum due to Theorem 18.

Remark 5.

The above results (including the algorithm) about maxf-LoMuF can be extended to directed graphs. The proofs remain valid up to minor modifications, so detailed proof are omitted here.

6 Conclusion

We formulated the target location problem for multi-commodity flows. It is a natural combination of the classic facility location problem and the multi-commodity flow problem, and extends both. It is interesting in theory and well-rooted in real-world applications.

We mainly study the issue of maximizing concurrent flows, both on directed and undirected networks. It is interesting to see that the directed case makes the problem harder: the problem is efficiently solvable on undirected trees, but NP-hard on di-paths. Another separation is that the problem is efficiently solvable for bi-source supply vectors on undirected graphs, while it is NP-hard for such supply vectors on directed graphs. We have also made progress on algorithm design: in addition to an exact algorithm on trees, an approximation algorithm is proposed for arbitrary undirected graphs, which leads to algorithms on symmetric directed graphs.

As the first step towards this novel direction, there remain numerous open questions. Just mention a few.

  1. 1.

    Though an η\eta-approximation algorithm exists on undirected networks, we know nothing about the lower bound of approximation ratio of the problem. Even whether a PTAS exists remains open. The directed situation is less satisfactory: except a trivial algorithm with approximation ratio kk, no non-trivial approximation algorithm on general directed graphs is known.

  2. 2.

    The variants deserve further studying. Since in many applications, targets can be chosen only from a candidate set, restricted version of our problem is of special interest. Cost minimization is an active topic in classic network flow problems. It can be easily defined in our framework, and is a rich research direction. One more variant has not yet been mentioned: This paper allows choosing just one target for each commodity, but what if more targets can be selected?

  3. 3.

    Online versions of our problem are also well motivated. Recall the scenario of geo-distributed data analysis. The typical case is that the applications arrive sequentially in an online fashion, rather than all at once as we mentioned before. The online fashion poses special challenges in algorithm design. We are even not sure whether an algorithm exists with guaranteed competitive ratio.

Acknowledgement

We are grateful to the anonymous referees for detailed corrections and suggestions. We also thank Prof. Yungang Bao for his encouragement and support. Special thanks go to Dr. Laiping Zhao, who helped the authors formulating the problem.

References

  • [1] H. An, M. Singh, and O. Svensson. Lp-based algorithms for capacitated facility location. In 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, pages 256–265, 2014.
  • [2] K. Andreev, C. Garrod, D. Golovin, B. M. Maggs, and A. Meyerson. Simultaneous source location. Acm Transactions on Algorithms, 6(1):Article No. 16, 2009.
  • [3] K. Arata, S. Iwata, K. Makino, and S. Fujishige. Locating sources to meet flow demands in undirected networks. Journal of Algorithms, 42(1):54–68, 2002.
  • [4] A. BernáTh. Source location in undirected and directed hypergraphs. Oper. Res. Lett., 36(3):355–360, 2008.
  • [5] C. Chekuri, S. Khanna, and F. B. Shepherd. The all-or-nothing multicommodity flow problem. SIAM Journal on Computing, 42(4):1467–1493, 2013.
  • [6] P. Christiano, J. Kelner, A. Madry, D. Spielman, and S.-H. Teng. Electrical flows, laplacian systems, and faster approximation of maximum flow in undirected graphs. Proceedings of the Annual ACM Symposium on Theory of Computing, page 273–282, June 2011.
  • [7] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms. MIT press, 2009.
  • [8] M. R. Gary and D. S. Johnson. Computers and intractability: A guide to the theory of np-completeness, 1979.
  • [9] S. Guha and S. Khuller. Greedy strikes back: Improved facility location algorithms. Journal of Algorithms, 31(1):228–248.
  • [10] S. L. Hakimi. Optimum locations of switching centers and the absolute centers and medians of a graph. Operations Research, 12(3):450–459, june 1964.
  • [11] J. Hastad. Clique is hard to approximate within n1ϵn^{1-\epsilon}. Acta Mathematica, 182(1):105–142, 1996.
  • [12] P. Hebler and H. W. Hamacher. Sink location to find optimal shelters in evacuation planning. EURO Journal on Computational Optimization, 4(3):325–347, 2016.
  • [13] H. Ito, M. Paterson, and K. Sugihara. The multi-commodity source location problems and the price of greed. Journal of Graph Algorithms and Applications, 13(1):55–73, 2009.
  • [14] J. A. Kelner, Y. T. Lee, L. Orecchia, and A. Sidford. An almost-linear-time algorithm for approximate max flow in undirected graphs and its multicommodity generalizations. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), page 217–226, USA, 2014. Society for Industrial and Applied Mathematics.
  • [15] J. A. Kelner, G. L. Miller, and R. Peng. Faster approximate multicommodity flow using quadratically coupled flows. In Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, page 1–18, New York, NY, USA, 2012. Association for Computing Machinery.
  • [16] S. Khuller. Data-Aware Scheduling in Datacenters. PhD thesis, University of Maryland (College Park, Md.), 2016.
  • [17] P. Kolman and C. Scheideler. Improved bounds for the unsplittable flow problem. In Thirteenth Acm-siam Symposium on Discrete Algorithms, 2002.
  • [18] G. Kortsarz and Z. Nutov. A note on two source location problems. Journal of Discrete Algorithms, 6(3):520–525, sep 2008.
  • [19] R. Krauthgamer, J. R. Lee, and H. Rika. Flow-cut gaps and face covers in planar graphs. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 525–534. SIAM, 2019.
  • [20] M. Labbe, D. Peeters, and J.-F. Thisse. Location on networks. Network Routing, 8, 02 1992.
  • [21] Y. T. Lee, S. Rao, and N. Srivastava. A new approach to computing maximum flows using electrical flows. In Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, page 755–764, 2013.
  • [22] S. Li. A 1.488 approximation algorithm for the uncapacitated facility location problem. In Automata, Languages & Programming-international Colloquium, Icalp, Zurich, Switzerland, July, Part II, pages 45–58.
  • [23] A. Madry. Computing maximum flow with augmenting electrical flows. In Proceedings of the IEEE 57th Annual Symposium on Foundations of Computer Science, pages 593–602, 2016.
  • [24] S. Mamada, K. Makino, and S. Fujishige. Optimal sink location problem for dynamic flows in a tree network. Ieice Transactions on Fundamentals of Electronics Communications and Computer Sciences, E85-A(5):1020–1025, 2002.
  • [25] S. Mamada, T. Uno, K. Makino, and S. Fujishige. An algorithm for the optimal sink location problem in dynamic tree networks. Discrete Applied Mathematics, 154(16):2387–2401, 2006.
  • [26] L. Monis, B. Kunjumon, and N. Guruprasad. Implementation of Maximum Flow Algorithm in an Undirected Network, pages 195–202. Springer Singapore, Singapore, 01 2019.
  • [27] M. T. Y. J. Naga, U. S., N. A., and G. S. A technical survey on optimization of processing geo-distributed data. Journal of Physics: Conference Series, 1000:012140, 2018.
  • [28] R. Peng. Approximate undirected maximum flows in o(mpolylog(n)) time. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, page 1862–1867, 2016.
  • [29] Q. Pu, G. Ananthanarayanan, P. Bodik, S. Kandula, A. Akella, P. Bahl, and I. Stoica. Low latency geo-distributed data analytics. In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, SIGCOMM ’15, page 421–434, 2015.
  • [30] M. Sakashita, K. Makino, and S. Fujishige. Minimum cost source location problems with flow requirements. Algorithmica, 50(4):555–583, 2006.
  • [31] A. Salmasi, A. Sidiropoulos, and V. Sridhar. On constant multi-commodity flow-cut gaps for families of directed minor-free graphs. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 535–553. SIAM, 2019.
  • [32] F. B. Shepherd, A. Vetta, and G. T. Wilfong. Polylogarithmic approximations for the capacitated single-sink confluent flow problem. In Proceedings of the IEEE 56th Annual Symposium on Foundations of Computer Science, pages 748–758, 2015.
  • [33] J. Sherman. Area-convexity, ll_{\infty} regularization, and undirected multicommodity flow. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 452–460, 2017.
  • [34] D. B. Shmoys, va Tardos, and K. Aardal. Approximation algorithms for facility location problems (extended abstract). In Proceedings of the Twenty-Ninth Annual ACM Symposium on the Theory of Computing, El Paso, Texas, USA, May 4-6, 1997, 1997.
  • [35] H. Tamura, M. Sengoku, S. Shinoda, and T. Abe. Location problems on undirected flow networks. IEICE Trans., E73:1989–1993, 1990.
  • [36] B. C. Tansel, R. L. Francis, T. J. Lowe, M. Science, and N. Apr. Location on networks : A survey . part ii : Exploiting tree network structure. Management Science, 29(4):498–511, 1983.
  • [37] V. V. Vazirani. Approximation Algorithms. Springer, 2001.
  • [38] L. Yin, J. Sun, L. Zhao, C. Cui, J. Xiao, and C. Yu. Joint scheduling of data and computation in geo-distributed cloud systems. In 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pages 657–666, 2015.
  • [39] L. Yue, L. Zhao, C. Cui, and C. Yu. Fast big data analysis in geo-distributed cloud. In IEEE International Conference on Cluster Computing, 2016.
  • [40] L. Zhao, Y. Yang, A. Munir, A. X. Liu, Y. Li, and W. Qu. Optimizing geo-distributed data analytics with coordinated task scheduling and routing. IEEE Transactions on Parallel and Distributed Systems, 31(2):279–293, 2020.