Target Location Problem for Multi-commodity Flow

Xingwu Liu, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China. Email: [email protected].

Zhida Pan (corresponding author), Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China. Email: [email protected].

Yuyi Wang, ETH Zurich, Switzerland. Email: [email protected].

(April 15, 2020)

Abstract

Motivated by scheduling in geo-distributed data analysis, we propose a target location problem for multi-commodity flow (LoMuF for short). Given commodities to be sent from their sources, LoMuF aims at locating their targets so that the multi-commodity flow is optimized in some sense. LoMuF is a combination of two fundamental problems, namely, the facility location problem and the network flow problem. We study the hardness and algorithmic issues of the problem in various settings. The findings lie in three aspects. First, a series of NP-hardness and APX-hardness results are obtained, uncovering the inherent difficulty in solving this problem. Second, we propose an approximation algorithm for general undirected networks and an exact algorithm for undirected trees, which naturally induce efficient approximation algorithms on directed networks. Third, we observe separations between directed networks and undirected ones, indicating that imposing direction on edges makes the problem strictly harder. These results show the richness of the problem and pave the way to further studies.

1 Introduction

Nowadays, data is generated in a geo-distributed manner much faster than it can be transferred; for instance, telescopes around the world bring us an unimaginable amount of astronomy data. There are two main reasons for having geo-distributed data: (1) Datacenters (DCs) are built across the globe. (2) Organizations prefer to use multiple clouds to increase reliability, security, and processing. Besides, there exist applications that process and analyze a huge amount of massively geo-distributed data to extract useful information. A typical scenario in processing geo-distributed data is that several analysis tasks are running simultaneously, and each requires a fraction of the collected data [27, 29, 39, 38, 16]. In addition, every analysis task moves the needed data to a single location before the computation. Fig. 1 shows an example of geo-distributed telescope data.

The network bandwidth is a crucial factor in geo-distributed data movement and becomes the resource bottleneck. For example, the demand for bandwidth increased from 60 to 290 Tbps between the years 2011 and 2015 while the network capacity growth was not proportional. In 2015, the network capacity growth was only 40 percent, which was the lowest during the years 2011 and 2014 (see https://www.telegeography.com/researchservices/global-bandwidth-research-service/). When applications (such as electromagnetic radiation and infrared ray analysis), each handling data from some datacenters, have to be deployed, there is no meaningful notion of distance. The latency (travel time of a single small packet) under low-congestion conditions tends not to be noticeable to the end-users. The real difficulty here is the underlying capacity of the network. If links become congested, then the latency will increase and throughput will suffer. A key issue is how to allocate enough bandwidth to each application without causing congestion on the network [40].

Hence, we need to choose proper locations for the tasks to reduce congestion. Specifically, we propose this target location problem for multi-commodity flow: Given sources of multiple commodities on a capacitated network, the goal is to locate the targets to maximize the flow value.

We fix sources because, as our motivating example of geo-distributed data analysis shows, it is difficult, if not impossible, to change the datacenters that collect data: for efficiency, these datacenters should be close to the data generators. However, it is much more flexible to choose the target locations where the analysis tasks are performed.

The multi-commodity flow problem (MCF) is one of the most fundamental problems with a wide variety of scientific and engineering applications that have been studied intensively [32, 26]. In the most typical scenario, a finite number of commodities have to be sent from their sources to targets on a capacitated network. Each commodity has its own flow, and the commodities interact when their flows compete for capacity on common edges.

There are two general classes of MCF. One is network analysis which, based on a given network configuration, finds the optimal flow pattern for some objective function. The most studied objective functions include maximizing flow values and minimizing flow costs. The other belongs to network synthesis which seeks an optimal network configuration satisfying certain requirements.

In both classes, the targets of the commodities are taken for granted. To our surprise, researchers have long neglected how targets are chosen. This paper is devoted to initiating such a theory.

Figure 1: An overview of geo-distributed astronomy data and corresponding tasks

Our proposed problem extends the MCF framework. It does not belong to either of the two classes. It is a combination of the facility location problem and the network flow problem, which are inherently related [1]. Facility location is a branch of operations research concerned with locating or positioning at least one new facility among several existing facilities to optimize (minimize or maximize) at least one objective function. It is among the most fundamental problems in operations research and theoretical computer science [37]. Facility location met network flow in 1990 [35] and has inspired a series of works [13, 3, 18, 2, 12]. However, all the published works minimize the costs of the selected sources or targets (not the flow cost which, together with flow value, is the objective of network flow problems), and never consider the multi-commodity setting. The most crucial difference is that our combination is inherent, meaning that the objective is to optimize the flow value, whereas the literature focuses on the cost of the selected nodes rather than the cost of flow. Another benefit of our framework is that it can naturally extend almost every network flow problem, e.g., flow cost minimization.

Our model has other applications, such as Web server deployment. When servers must serve various demands from widely distributed users, for example, requests for different online videos, where should the servers be located so that the users have a good experience? Again we do not care about distance, and the key objective is to optimize the available bandwidth. There are more motivating examples, say, network-flow-based evacuation planning for an emergency, where shelters have to be selected and congestion decides the efficiency of evacuation. Interested readers are referred to [12].

These real-world scenarios justify our problem’s critical features: The commodities are only partially determined since the targets are not given and have themselves to be optimized, and a decisive factor of the optimization is the bandwidth rather than any notion of distance. This well motivates our problem.

1.1 Results and Discussion

We propose a novel model of target location for multi-commodity flow (LoMuF). On the one hand, we establish hardness results for various versions. On the other hand, we design algorithms for several versions. The results are as follows.

1. We show that the LoMuF problem is NP-hard on general undirected graphs.

If the targets are fixed, the problem reduces to an ordinary multi-commodity flow problem (allowing fractional flows) and is solvable in polynomial time. Hence the most challenging part of this problem is indeed how to locate the targets.
2. We design a polynomial-time algorithm solving LoMuF on trees.

Trees are important network structures in practice. Compared to the NP-hardness result above, the fact that there is only one path connecting a source and the target on a tree simplifies the problem. Our algorithm is simple and, perhaps surprisingly, shows that the interaction between different commodities does no harm on trees.
3. We present a $\max\{\theta-1,1\}$-approximation algorithm for LoMuF on general undirected graphs, where $\theta$ is the largest source number among all commodities.

This result shows that, when $\theta\leq 2$ (the so-called bi-source case), the problem can be solved efficiently, whereas, in view of the NP-hardness result above, it becomes intractable when $\theta\geq 3$.
4. For LoMuF on directed graphs (Di-LoMuF), we prove that it is also NP-hard and even cannot be efficiently approximated with a ratio less than 2.

In fact, we also show that LoMuF on undirected graphs can be reduced to the directed case Di-LoMuF, so Di-LoMuF is at least as hard.
5. Di-LoMuF also remains NP-hard on symmetric di-paths and bi-source supply vectors.

These are clear separations between undirected LoMuF and Di-LoMuF, since undirected LoMuF is efficiently solvable on trees while Di-LoMuF is even difficult on paths, a very special case of trees. As we pointed out above, the bi-source instances are easy for undirected LoMuF, but not for Di-LoMuF.
6. For the special case on symmetric di-trees, Di-LoMuF has a polynomial-time 2-approximation algorithm.

Though we have seen several hardness results of Di-LoMuF, for a special but still meaningful subset, where every link has the same capability of downloading and uploading, we can obtain an efficient approximation algorithm.
7. We show that our results above can also be extended to other variants of LoMuF such as maximum sum flows, unsplittable flows, restricted candidate targets, maximum feasible flows, and so on. For the unsplittable version, we show that it cannot be approximated within ratio 2. For the version with restrictions on targets, it is NP-hard on uni-source supply vectors and stars and cannot be efficiently approximated within ratio $\frac{7}{6}$ on trees. For the maximum feasible flows version, we prove that for any constant $\epsilon>0$, unless NP=ZPP, it cannot be approximated within $O(k^{1-\epsilon})$ on $k$ supply vectors.

This shows that the framework of the new location problems has a powerful capability of modeling different scenarios in practice and enriches the theory of location problems and network flow problems.

1.2 Related Work

There is a vast and growing literature on multi-commodity flow and its single-commodity special case [32]. Basically, there are two types of optimization objectives, namely, minimum cost and maximum flow, the latter being the focus of this paper. The main theme of maximum flow in recent years is improving the efficiency of approximation algorithms [26, 14, 6, 21, 28, 23, 15, 33, 5]. The flow-cut duality is also a challenging issue and has attracted much attention from researchers [19, 31].

Facility location has flourished ever since the 1960s and remains an active topic in operations research and theoretical computer science [10, 20, 37]. Though in general no constant-ratio approximation algorithm exists, the problem can be approximated within a constant ratio on metric spaces. One of the main threads of research is to improve the approximation ratio in various situations [34, 22, 9].

Though inherently related to multi-commodity flow [1], facility location was combined with network flow only in 1990 [35], when the source location problem was proposed. Roughly speaking, the mission of the source location problem on a network is to find a set of sources from which enough flow can be sent to each prescribed target. In addition to flow requirements, connectivity and vertex coverage are also frequently used constraints. Work in this line can be classified into two categories. One is independent source location, meaning that the flows to different targets do not interact [3, 30, 4, 24, 25, 13, 18]. The other is simultaneous source location, where the flows concurrently exist and interact by competing for edge capacities [2, 12]. An interesting application is emergency evacuation planning [12], where shelters are to be located so that residents in a disaster can reach them as fast as possible. In such applications, capacities are also usually imposed on network nodes, rather than just on edges as in typical network flow models. All the mentioned works have two common features. First, essentially only a single commodity is considered, which is multi-source multi-target. Second, the objective is to optimize some measure of the selected sources (say, total cost), rather than a property of the flow (say, flow value). This is in sharp contrast to our proposed problem.

2 Preliminaries and Problem Statement

In this section, we review key notions and notations used in this paper, and formally define the location problem.

2.1 Preliminaries

Let $\mathbb{R}$ ( $\mathbb{R}_{+},\mathbb{R}_{-}$ , respectively) represent the set of (non-negative, non-positive, respectively) real numbers. We use $\vec{x}$ for a vector, and $\vec{x}(y)$ for its $y$ -th entry. When we denote a set by an upper-case letter, we usually write the corresponding (subscripted) lower-case letter for the members.

A network is a capacitated graph $G=(V,E,\vec{c})$, where $V$ is the vertex set, $E$ is the edge set, and $\vec{c}\in\mathbb{R}_{+}^{E}$ assigns capacities to the edges. We first mainly focus on undirected graphs in this paper, and will consider directed graphs in Section 4. For any $v,v^{\prime}\in V$, we use $\langle v,v^{\prime}\rangle$, or $\langle v^{\prime},v\rangle$ interchangeably, to denote the edge between $v,v^{\prime}$. A commodity is described by a demand vector $\vec{d}\in\mathbb{R}^{V}$ satisfying $\sum_{v\in V}\vec{d}(v)=0$, where any $v$ such that $\vec{d}(v)<0$ ($\vec{d}(v)>0$, respectively) is called a source (a target, respectively). Intuitively, each source $v$ has to send out $|\vec{d}(v)|$ units of the commodity, and in total $\vec{d}(u)$ units are delivered to target $u$. The vertex set of a graph $G$ is denoted by $V(G)$.

To specify flows over a network, we always arbitrarily orient all the edges and keep the orientation implicit unless necessary. For any $v\in V$, let $E_{-}(v)$ ($E_{+}(v)$, respectively) stand for the set of incoming (outgoing, respectively) edges. A flow is a vector $\vec{f}\in\mathbb{R}^{E}$ which, for any edge $e\in E$, means $|\vec{f}(e)|$ units of transportation along $e$, in the direction of its orientation if $\vec{f}(e)>0$ and in the opposite direction otherwise. Given flows $\vec{f},\vec{f}^{\prime}\in\mathbb{R}^{E}$, we write $\vec{f}\lesssim\vec{f}^{\prime}$ if $|\vec{f}(e)|\leq|\vec{f}^{\prime}(e)|$ for any $e\in E$. A flow $\vec{f}$ is said to satisfy a demand vector $\vec{d}$ if, for any $v\in V$, $\vec{d}(v)=\sum_{e\in E_{-}(v)}\vec{f}(e)-\sum_{e\in E_{+}(v)}\vec{f}(e)$. A multi-commodity flow, which means a set $F$ of flows, is valid if its congestion $\sum_{\vec{f}\in F}|\vec{f}(e)|$ along any edge $e\in E$ is at most $\vec{c}(e)$.
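The definitions of "satisfy" and "valid" can be made concrete with a small sketch. The representation below is an assumption made only for illustration: edges are stored as a list of (tail, head) pairs giving the arbitrary orientation, a flow is a list of signed numbers indexed like the edges, and the function names are hypothetical.

```python
from collections import defaultdict

def net_inflow(edges, flow):
    """Net incoming flow at every vertex.

    edges[j] = (tail, head) is the arbitrary orientation of edge j, and
    flow[j] > 0 means transport along that orientation.
    """
    net = defaultdict(float)
    for (tail, head), value in zip(edges, flow):
        net[head] += value
        net[tail] -= value
    return net

def satisfies(edges, flow, demand, tol=1e-9):
    """True if the net inflow at each vertex equals its demand."""
    net = net_inflow(edges, flow)
    vertices = set(demand) | set(net)
    return all(abs(net[v] - demand.get(v, 0.0)) <= tol for v in vertices)

def is_valid(flows, capacity, tol=1e-9):
    """True if the congestion on every edge stays within its capacity."""
    return all(sum(abs(f[j]) for f in flows) <= c + tol
               for j, c in enumerate(capacity))

# Path a - b - c with both edges oriented towards b.
edges = [("a", "b"), ("c", "b")]
f1 = [1.0, -1.0]   # one unit from a to c (against the orientation on the second edge)
f2 = [0.0, 0.5]    # half a unit from c to b
```

Here the two flows together put congestion 1.5 on the second edge, so the pair is valid only if that edge has capacity at least 1.5.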

The maximum concurrent flow problem (MCF for short) has been extensively studied and is still an active topic. Specifically, given demand vectors $\vec{d}_{i},1\leq i\leq k$ on a capacitated graph $G$, the mission of MCF is to find the maximum $\lambda$ such that $\lambda\vec{d}_{i},1\leq i\leq k,$ can be satisfied by a valid multi-commodity flow on $G$. The optimum $\lambda$ will be denoted by $\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k})$.

Let’s recall some properties of MCF.

Lemma 1.

MCF lies in P.

As mentioned in [7, page 863], there is no known purely combinatorial algorithm solving MCF exactly and efficiently. The only commonly used algorithm is based on linear programming.
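For reference, one standard way to write MCF as a linear program in the notation above is sketched below. Splitting each flow into non-negative parts $\vec{f}^{+}_{i},\vec{f}^{-}_{i}$ with $\vec{f}_{i}=\vec{f}^{+}_{i}-\vec{f}^{-}_{i}$ is an assumption of this sketch, used only to linearize the absolute values in the congestion constraint; it is not taken from the text.

```latex
\begin{array}{rll}
\max & \lambda & \\
\textrm{s.t.} & \sum_{e\in E_{-}(v)}\bigl(\vec{f}^{+}_{i}(e)-\vec{f}^{-}_{i}(e)\bigr)-\sum_{e\in E_{+}(v)}\bigl(\vec{f}^{+}_{i}(e)-\vec{f}^{-}_{i}(e)\bigr)=\lambda\vec{d}_{i}(v) & \forall v\in V,\ 1\leq i\leq k\\
& \sum_{i=1}^{k}\bigl(\vec{f}^{+}_{i}(e)+\vec{f}^{-}_{i}(e)\bigr)\leq\vec{c}(e) & \forall e\in E\\
& \vec{f}^{+}_{i}(e),\ \vec{f}^{-}_{i}(e)\geq 0 & \forall e\in E,\ 1\leq i\leq k
\end{array}
```

The program has $2k|E|+1$ variables and $k|V|+|E|$ constraints besides non-negativity, which is polynomial in the input size and is consistent with Lemma 1.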

A multi-commodity flow $F$ on a capacitated graph $G$ is said to be a decomposition of flow $\vec{f}$ , if $\vec{f}(e)=\sum_{\vec{f}^{\prime}\in F}\vec{f}^{\prime}(e)$ and $|\vec{f}(e)|=\sum_{\vec{f}^{\prime}\in F}|\vec{f}^{\prime}(e)|$ for any edge $e$ of $G$ .

Lemma 2.

Arbitrarily fix a demand vector $\vec{d}$ on a capacitated graph $G$ . Suppose $\vec{d}$ has exactly one target $t$ . Then any flow satisfying $\vec{d}$ can be decomposed into a multi-commodity flow which satisfies the demand vectors $\{\vec{d}_{v}:v\textrm{ is a source of }\vec{d}\}$ . Here, each $\vec{d}_{v}$ is such that for any vertex $u$ of $G$ ,

\vec{d}_{v}(u)=\begin{cases}\vec{d}(v)&\textrm{if }u=v\\ -\vec{d}(v)&\textrm{if }u=t\\ 0&\textrm{otherwise}\end{cases}.

Note that the decomposition in Lemma 2 is not necessarily unique. Any such one will be called a canonical decomposition of the flow.

Given any vertex subset $U$ of a graph $G=(V,E,\vec{c})$, the cut induced by $U$, denoted by $Cut(U)$, is defined to be the set of edges bridging $U$ and $V\setminus U$. Let $E_{-}(U)=Cut(U)\bigcap(\bigcup_{u\in U}E_{-}(u))$ be the set of edges coming into $U$, and $E_{+}(U)=Cut(U)\setminus E_{-}(U)$.

Lemma 3.

Suppose that $\vec{f}$ is a flow satisfying a demand vector $\vec{d}$ on a capacitated graph $G=(V,E,\vec{c})$ . Then for any $U\subseteq V$ , $\sum_{e\in E_{-}(U)}\vec{f}(e)-\sum_{e\in E_{+}(U)}\vec{f}(e)=\sum_{u\in U}\vec{d}(u)$ .
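Lemma 3 is stated without proof; a short derivation, given here only as a sketch, sums the conservation constraints over $U$:

```latex
\begin{array}{rcl}
\sum_{u\in U}\vec{d}(u) &=& \sum_{u\in U}\Bigl(\sum_{e\in E_{-}(u)}\vec{f}(e)-\sum_{e\in E_{+}(u)}\vec{f}(e)\Bigr)\\
&=& \sum_{e\in E_{-}(U)}\vec{f}(e)-\sum_{e\in E_{+}(U)}\vec{f}(e)
\end{array}
```

The second equality holds because an edge with both endpoints in $U$ appears once with a plus sign (at its head) and once with a minus sign (at its tail) and therefore cancels, while an edge of $Cut(U)$ appears exactly once, with a plus sign if it comes into $U$ and a minus sign otherwise.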

2.2 Target Location Problem

Intuitively, our goal is to properly locate targets for multiple commodities. We formulate this problem in this subsection.

Given a capacitated graph $G=(V,E,\vec{c})$ , any $\vec{s}\in\mathbb{R}_{-}^{V}$ is called a supply vector on $G$ . For any supply vector $\vec{s}$ and $v\in V$ , we define a demand vector $\vec{s}\circ v$ such that for any $u\in V$ ,

(\vec{s}\circ v)(u)=\begin{cases}\vec{s}(u)&\textrm{if }u\neq v\\ -\sum_{w\in V\setminus\{v\}}\vec{s}(w)&\textrm{otherwise}\end{cases}.

It is time to formulate the problem of target location for maximizing concurrent multi-commodity flow, LoMuF for short. Given supply vectors $\vec{s}_{1},\cdots,\vec{s}_{k}$ on a capacitated graph $G$ , LoMuF aims at finding $v_{1},\cdots,v_{k}$ such that $\lambda(G;\vec{s}_{1}\circ v_{1},\cdots,\vec{s}_{k}\circ v_{k})$ is maximized. By abuse of the notation, the optimum objective value is again denoted by $\lambda(G;\vec{s}_{1},\cdots,\vec{s}_{k})$ .
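As an illustration of the operator $\vec{s}\circ v$ and of the search space of LoMuF, the following sketch represents a supply vector as a dictionary from vertices to non-positive numbers. The function names `with_target` and `candidate_targets` are hypothetical and used only here.

```python
from itertools import product

def with_target(supply, target):
    """Return the demand vector (s o v): every supply is kept except at
    `target`, which instead receives the total sent by the other vertices."""
    demand = {u: amount for u, amount in supply.items() if u != target}
    demand[target] = -sum(demand.values())
    return demand

def candidate_targets(vertices, k):
    """All |V|^k ways to pick one target per commodity."""
    return product(vertices, repeat=k)

supply = {"a": -2.0, "b": -1.0, "c": 0.0}
demand_at_c = with_target(supply, "c")   # c receives 3 units
demand_at_a = with_target(supply, "a")   # the supply at a itself is dropped
```

Note that locating the target at a source discards that source's own supply, exactly as in the definition of $\vec{s}\circ v$. Enumerating all $|V|^{k}$ target tuples and solving MCF for each is a correct but exponential-time baseline.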

3 Hardness and Algorithms of LoMuF

We begin by studying the hardness of LoMuF. Our proof relies on a well-known NP-complete problem, 3-dimensional matching (3-DM for short). Though LoMuF is NP-hard in general, we devise an algorithm solving LoMuF on trees efficiently, and show that a simple strategy yields a reasonable approximation on graphs where every commodity has a bounded number of sources.

3.1 Hardness Result

A 3-DM instance is a quadruple $(X,Y,Z,W)$, where $X,Y,Z$ are pairwise disjoint finite sets of equal size, and $W\subseteq\{\{x,y,z\}:x\in X,y\in Y,z\in Z\}$. The goal is to decide whether $W$ contains a perfect matching, namely, a subset $W^{\prime}\subseteq W$ such that $|W^{\prime}|=|X|$ and $\bigcup_{w\in W^{\prime}}w=X\bigcup Y\bigcup Z$. The trivial cases where $\bigcup_{w\in W}w\neq X\bigcup Y\bigcup Z$ will not be considered.
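For concreteness, the 3-DM decision problem can be stated as a brute-force check. This is only an exponential-time illustration of the definition, with a hypothetical function name; it is not part of the reduction.

```python
from itertools import combinations

def has_perfect_matching(X, Y, Z, W):
    """Brute-force 3-DM: look for |X| triples of W that cover X, Y and Z."""
    universe = set(X) | set(Y) | set(Z)
    triples = [frozenset(w) for w in W]
    for chosen in combinations(triples, len(X)):
        if set().union(*chosen) == universe:
            return True
    return False

X, Y, Z = ["x1", "x2"], ["y1", "y2"], ["z1", "z2"]
W_yes = [("x1", "y1", "z1"), ("x2", "y2", "z2"), ("x1", "y2", "z2")]
W_no = [("x1", "y1", "z1"), ("x1", "y2", "z2"), ("x2", "y2", "z1")]
```

In `W_yes` the first two triples form a perfect matching, whereas in `W_no` every pair of triples shares an element, although the triples together still cover all six elements.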

We first show that LoMuF is NP-hard, which is more or less a surprise, compared with Lemma 1.

Theorem 4.

Given supply vectors $\vec{s}_{1},\cdots,\vec{s}_{k}$ on a capacitated graph $G$ , it is NP-complete to decide whether $\lambda(G;\vec{s}_{1},\cdots,\vec{s}_{k})\geq 1$ .

Proof.

Choose a target $v_{i}$ for the supply vector $\vec{s}_{i}$, for any $1\leq i\leq k$. Due to Lemma 1, we can use $v_{1},\cdots,v_{k}$ as a certificate to check whether $\lambda(G;\vec{s}_{1},\cdots,\vec{s}_{k})\geq 1$. This means that the decision problem lies in NP.

To prove NP-completeness, it suffices to establish a reduction from 3-DM.

Given a 3-DM instance $(X,Y,Z,W)$ with $|X|=k$ and $|W|=l$, we construct a capacitated graph $G=(V,E,\vec{c})$ as illustrated in Figure 2. Specifically, $G$ consists of three subgraphs $H_{X},H_{Y},H_{Z}$ connected via $W$. $H_{X}$ is a complete bipartite graph with vertex sets $X$ and $T_{X}=\{t_{X},t^{\prime}_{X}\}$, and any $x\in X$ is adjacent to $w\in W$ if and only if $x\in w$; likewise for $H_{Y},H_{Z}$. All the edges are oriented upward in Figure 2.

As to the capacity, let $E^{\prime}$ be the set of red edges, namely, those incident to $t^{\prime}_{X},t^{\prime}_{Y}$ or $t^{\prime}_{Z}$ . For any $e=\langle v,t\rangle\in E^{\prime}$ with $v\in X\bigcup Y\bigcup Z$ , let $W_{e}=\{w\in W:v\in w\}$ . Then for any $e\in E$ ,

\vec{c}(e)=\begin{cases}|W_{e}|-1&\textrm{if }e\in E^{\prime}\\ 1&\textrm{otherwise}\end{cases}.

We define $l$ supply vectors $\vec{d}_{1}=\cdots=\vec{d}_{k},\vec{d}_{k+1}=\cdots=\vec{d}_{l}$ such that for any $v\in V$ ,

\vec{d}_{1}(v)=\begin{cases}-1&\textrm{if }v\in\{t_{X},t_{Y},t_{Z}\}\\ 0&\textrm{otherwise}\end{cases}

\vec{d}_{k+1}(v)=\begin{cases}-1&\textrm{if }v\in\{t^{\prime}_{X},t^{\prime}_{Y},t^{\prime}_{Z}\}\\ 0&\textrm{otherwise}\end{cases}
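The construction can be written out as a short sketch. Vertex names are assumptions made for illustration: elements of $X,Y,Z$ are used as given, each triple is a `frozenset`, and the six extra vertices are the strings `t_X`, `t'_X`, and so on. Edges are keyed as (lower endpoint, upper endpoint), assuming that "upward" in Figure 2 runs from $T_{X},T_{Y},T_{Z}$ through the elements to $W$.

```python
def build_lomuf_instance(X, Y, Z, W):
    """Return (capacity, supplies) of the reduction from a 3-DM instance.

    capacity maps an edge (lower, upper) to its capacity; supplies is the
    list of l supply vectors, each a dict from vertex to a non-positive number.
    """
    triples = [frozenset(w) for w in W]
    capacity = {}
    for name, part in (("X", X), ("Y", Y), ("Z", Z)):
        plain, primed = f"t_{name}", f"t'_{name}"
        for v in part:
            degree = sum(1 for w in triples if v in w)  # |W_e| for the red edge at v
            capacity[(plain, v)] = 1
            capacity[(primed, v)] = degree - 1           # red edges, i.e. the set E'
    for w in triples:
        for v in w:
            capacity[(v, w)] = 1                         # edges between elements and W
    k, l = len(X), len(triples)
    first = {f"t_{n}": -1 for n in "XYZ"}
    second = {f"t'_{n}": -1 for n in "XYZ"}
    supplies = [dict(first) for _ in range(k)] + [dict(second) for _ in range(l - k)]
    return capacity, supplies

X, Y, Z = ["x1", "x2"], ["y1", "y2"], ["z1", "z2"]
W = [("x1", "y1", "z1"), ("x2", "y2", "z2"), ("x1", "y2", "z2")]
capacity, supplies = build_lomuf_instance(X, Y, Z, W)
```

In this example the edges incident to $W$ have total capacity $3l=9$, which is the quantity used in the counting argument of the proof.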

The rest of the proof is devoted to showing that $W$ has a perfect matching if and only if the LoMuF instance satisfies $\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{l})\geq 1$ , which will lead to NP-completeness of our decision problem. The proof consists of two parts.

Part 1: a perfect matching in $W$ implies $\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{l})\geq 1$ .

Without loss of generality, suppose $\{w_{1},\cdots,w_{k}\}\subset W$ is a perfect matching. For any $1\leq i\leq k$ , define flow $\vec{f}_{i}$ such that for any edge $e\in E$ ,

\vec{f}_{i}(e)=\begin{cases}1&\textrm{if }e\textrm{ is incident to }w_{i}\textrm{, or }e=\langle t,u\rangle\textrm{ with }t\in\{t_{X},t_{Y},t_{Z}\}\textrm{ and }u\in w_{i}\\ 0&\textrm{otherwise}\end{cases}.

For any $k+1\leq j\leq l$ , define flow $\vec{f}_{j}$ such that for any edge $e\in E$ ,

\vec{f}_{j}(e)=\begin{cases}1&\textrm{if }e\textrm{ is incident to }w_{j}\textrm{, or }e=\langle t,u\rangle\textrm{ with }t\in\{t^{\prime}_{X},t^{\prime}_{Y},t^{\prime}_{Z}\}\textrm{ and }u\in w_{j}\\ 0&\textrm{otherwise}\end{cases}.

It is straightforward to check that the multi-commodity flow $\vec{f}_{1},\cdots,\vec{f}_{l}$ is valid and satisfies the demand vectors $\vec{d}_{1}\circ w_{1},\cdots,\vec{d}_{l}\circ w_{l}$ . Hence, $\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{l})\geq 1$ .

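Part 2: $\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{l})\geq 1$ implies a perfect matching in $W$. Let $v_{1},\cdots,v_{l}$ be targets and $\vec{f}_{1},\cdots,\vec{f}_{l}$ a valid multi-commodity flow satisfying $\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{l}\circ v_{l}$. We establish two facts.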

1. Fact 1: $v_{i}\in W$ for any $1\leq i\leq l$.

Consider the congestion of any $\vec{f}_{i}$ on $Cut(W)=Cut(V(H_{X}))\bigcup Cut(V(H_{Y}))\bigcup Cut(V(H_{Z}))$. Let's proceed case by case.

- $v_{i}\in W$. Applying Lemma 3 to $\vec{f}_{i},\vec{d}_{i}\circ v_{i}$, we see that the congestion of $\vec{f}_{i}$ on $Cut(W)$ is at least 3.
- $v_{i}\notin W$. Without loss of generality, assume $v_{i}\in V(H_{X})$. Applying Lemma 3 to $\vec{f}_{i},\vec{d}_{i}\circ v_{i}$, we see that the congestion of $\vec{f}_{i}$ on $Cut(V(H_{X}))$ is at least 2, and those on $Cut(V(H_{Y}))$ and $Cut(V(H_{Z}))$ are both at least 1. Hence, the congestion of $\vec{f}_{i}$ on $Cut(W)$ is at least 4.

Since the total capacity of $Cut(W)$ is $3l$, which upper-bounds the total congestion, we get Fact 1.
2. Fact 2: the sets $v_{1},\cdots,v_{k}$ are pairwise disjoint.

For contradiction, suppose without loss of generality that $v_{1}=w_{1},v_{2}=w_{2},x_{1}\in w_{1}\bigcap w_{2}$. Applying Lemma 3 to the multi-commodity flow $\vec{f}_{i},1\leq i\leq l$ and the demand vectors $\vec{d}_{i}\circ v_{i},1\leq i\leq l$, we have $\sum_{i=1}^{l}\sum_{e\in Cut(W)}\vec{f}_{i}(e)=3l$. This implies that $\sum_{i=1}^{l}\vec{f}_{i}(e)=1$ for any $e\in Cut(W)$. Namely, each edge in $Cut(W)$ is full of upward flow. Likewise, each edge in $Cut(T_{X})$ is also full of upward flow. Let $e=\langle t_{X},x_{1}\rangle$. Then we have $\vec{f}_{1}(e^{\prime})=\vec{f}_{2}(e^{\prime})=0$ for any edge $e^{\prime}\neq e$ in $H_{X}$, since flow along such an edge cannot reach $w_{1}$ or $w_{2}$. This, together with the precondition that $\vec{f}_{1}$ satisfies $\vec{d}_{1}\circ v_{1}$, implies $\vec{f}_{1}(e)=1$. Likewise, $\vec{f}_{2}(e)=1$. A contradiction is reached since $\vec{c}(e)=1$.

By Facts 1 and 2, $v_{1},\cdots,v_{k}$ are $k$ pairwise disjoint triples in $W$, so they cover $X\bigcup Y\bigcup Z$ and form a perfect matching.

∎

3.2 LoMuF on Trees

Theorem 4 indicates that LoMuF is hard to solve on general graphs, but does not exclude the possibility of an efficient algorithm solving LoMuF for some important special cases. Indeed, LoMuF on trees allows a fast algorithm, as presented in Algorithm 1. In fact, networks with tree structure are also the focus of the related literature [30, 36, 5, 24, 25, 13, 2].

Without loss of generality, trees will be arbitrarily rooted, so the concepts of ancestors, descendants, and subtrees are well defined as usual. Given vertices $u,v$ of a tree, we write $u\prec v$ if $u$ is a descendant of $v$ , and $u\preceq v$ if $u\prec v$ or $u=v$ .

Let’s begin with a polynomial-time algorithm, which turns out to exactly solve LoMuF on trees.

Input: a capacitated tree $G=(V,E,\vec{c})$ , supply vectors $\vec{d}_{1},\cdots,\vec{d}_{k}$
Output: $v_{i}\in V,1\leq i\leq k$

1: for each $1\leq i\leq k$
2:   Let $v_{i}$ be the lowest common ancestor of the sources of $\vec{d}_{i}$
3:   while there is a child $u$ of $v_{i}$ such that $\sum_{v\not\preceq u}|\vec{d}_{i}(v)|<\sum_{v\preceq u}|\vec{d}_{i}(v)|$
4:     Let $v_{i}\leftarrow u$
5: Output($v_{1},\cdots,v_{k}$)

Algorithm 1 The algorithm for LoMuF on trees.
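A minimal Python sketch of Algorithm 1 follows. The data representation is an assumption: the rooted tree is a dictionary `parent` mapping every vertex to its parent (the root maps to `None`), and each supply vector is a dictionary from vertices to non-positive numbers. The function name is hypothetical. Capacities do not appear because Algorithm 1 does not use them.

```python
def locate_targets_on_tree(parent, supplies):
    """Sketch of Algorithm 1: one target per supply vector on a rooted tree."""
    children = {v: [] for v in parent}
    root = None
    for v, p in parent.items():
        if p is None:
            root = v
        else:
            children[p].append(v)
    # Top-down order; its reverse visits every vertex before its parent.
    order = [root]
    for v in order:
        order.extend(children[v])

    targets = []
    for supply in supplies:
        weight = {v: abs(supply.get(v, 0)) for v in parent}     # becomes sum of |d(v)| over the subtree
        count = {v: int(supply.get(v, 0) < 0) for v in parent}  # becomes number of sources in the subtree
        for v in reversed(order):
            if parent[v] is not None:
                weight[parent[v]] += weight[v]
                count[parent[v]] += count[v]
        total_weight, total_count = weight[root], count[root]

        # Line 2: the lowest common ancestor is the deepest vertex whose
        # subtree contains all sources.
        v = root
        while total_count > 0:
            below = [u for u in children[v] if count[u] == total_count]
            if not below:
                break
            v = below[0]
        # Lines 3-4: descend while a child subtree holds a strict majority.
        while True:
            heavy = [u for u in children[v] if total_weight - weight[u] < weight[u]]
            if not heavy:
                break
            v = heavy[0]
        targets.append(v)
    return targets

parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a"}
supplies = [{"c": -3, "b": -1}, {"c": -1, "d": -1}, {"c": -1, "b": -1}]
targets = locate_targets_on_tree(parent, supplies)
```

On this example the first commodity is pulled all the way down to `c`, whose subtree holds 3 of the 4 units, while the other two stay at the lowest common ancestor of their sources because no child subtree holds a strict majority.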

Theorem 5.

The output of Algorithm 1 is an optimum solution to LoMuF on trees.

Proof.

Given a capacitated tree $G=(V,E,\vec{c})$ and supply vectors $\vec{d}_{1},\cdots,\vec{d}_{k}\in\mathbb{R}_{-}^{V}$ , let $v_{1},\cdots,v_{k}$ be the output of Algorithm 1. Orient any edge of $G$ upward, i.e., from a vertex to its parent. The theorem is proven in two steps.

Step 1: Arbitrarily fix $1\leq i\leq k$ . We claim that for any $w\in V$ , any $\lambda>0$ , and any flow $\vec{f}$ satisfying $\lambda\vec{d}_{i}\circ w$ , there is a flow $\vec{f}^{\prime}\lesssim\vec{f}$ which satisfies $\lambda\vec{d}_{i}\circ v_{i}$ .

The claim is proved by induction on the hop distance (i.e., the number of edges) between $v_{i}$ and $w$ , denoted by $dist(v_{i},w)$ .

Basis: The claim trivially holds when $dist(v_{i},w)=0$ .

Hypothesis: The claim holds when $dist(v_{i},w)<\delta$ .

Induction: $dist(v_{i},w)=\delta>0$ . Let $x$ be the lowest common ancestor of the sources of $\vec{d}_{i}$ . We proceed case by case.

Case 1: $v_{i}\prec w$ .

If $x\prec w$ , set flow $\vec{f}^{\prime}$ such that for any edge $e\in E$ ,

\vec{f}^{\prime}(e)=\begin{cases}\vec{f}(e)&\textrm{if }e\textrm{ lies in the subtree rooted at }x\\ 0&\textrm{otherwise}\end{cases}.

One can easily check that $\vec{f}^{\prime}\lesssim\vec{f}$ and $\vec{f}^{\prime}$ satisfies $\lambda\vec{d}_{i}\circ x$ . Let $w^{\prime}=x$ .

If $w\preceq x$ , it must happen that $v_{i}=w$ at the beginning of some “while loop” of Algorithm 1 when handling $\vec{d}_{i}$ . That loop must assign $u$ to $v_{i}$ , where $u$ is the child of $w$ satisfying the condition in Line 3. Note that $u$ lies on the path between $w$ and the final $v_{i}$ . Set flow $\vec{f}^{\prime}$ such that for any edge $e\in E$ ,

\vec{f}^{\prime}(e)=\begin{cases}\lambda\sum_{v\not\preceq u}\vec{d}_{i}(v)&\textrm{if }e=\langle u,w\rangle\\ \vec{f}(e)&\textrm{otherwise}\end{cases}.

By Lemma 3, we see that $\vec{f}(\langle u,w\rangle)=\lambda\sum_{v\preceq u}|\vec{d}_{i}(v)|$ . Then the condition in Line 3 implies $\vec{f}^{\prime}\lesssim\vec{f}$ . Furthermore, one can check that $\vec{f}^{\prime}$ satisfies $\lambda\vec{d}_{i}\circ u$ . Let $w^{\prime}=u$ .

Case 2: $w\prec v_{i}$ . Let $y$ be the child of $v_{i}$ such that $w\preceq y$ . Let $E_{w,v_{i}}$ be the edges on the path between $w$ and $v_{i}$ . Define flow $\vec{f}^{\prime}$ such that for any edge $e\in E$ ,

\vec{f}^{\prime}(e)=\begin{cases}\lambda\sum_{v\preceq u}|\vec{d}_{i}(v)|&\textrm{if }e=\langle u,u^{\prime}\rangle\in E_{w,v_{i}}\textrm{ with }u\prec u^{\prime}\\ \vec{f}(e)&\textrm{otherwise}\end{cases}.

Since Algorithm 1 outputs $v_{i}$ rather than $y$ for $\vec{d}_{i}$ , it must hold that

\displaystyle\sum_{v\not\preceq y}|\vec{d}_{i}(v)|\geq\sum_{v\preceq y}|\vec{d}_{i}(v)|.\qquad(1)

For any edge $e=\langle u,u^{\prime}\rangle\in E_{w,v_{i}}$ with $u\prec u^{\prime}$ , we have

\begin{array}[]{rll}|\vec{f}(e)|&=\lambda\sum_{v\not\preceq u}|\vec{d}_{i}(v)|&\textrm{by Lemma 3}\\ &\geq\lambda\sum_{v\not\preceq y}|\vec{d}_{i}(v)|&\textrm{by }u\preceq y\\ &\geq\lambda\sum_{v\preceq y}|\vec{d}_{i}(v)|&\textrm{by Inequality (1)}\\ &\geq\lambda\sum_{v\preceq u}|\vec{d}_{i}(v)|&\textrm{by }u\preceq y\\ &=|\vec{f}^{\prime}(e)|.&\end{array}

Hence, $\vec{f}^{\prime}\lesssim\vec{f}$ . One can also check that $\vec{f}^{\prime}$ satisfies $\lambda\vec{d}_{i}\circ v_{i}$ . Let $w^{\prime}=v_{i}$ .

Case 3: neither $w\preceq v_{i}$ nor $v_{i}\preceq w$ . Let $y$ be the lowest common ancestor of $w$ and $v_{i}$ . We have either $x\prec y$ or $y\prec x$ .

If $x\prec y$ , define flow $\vec{f}^{\prime}$ such that for any edge $e\in E$ ,

\vec{f}^{\prime}(e)=\begin{cases}\vec{f}(e)&\textrm{if }e\textrm{ lies in the subtree rooted at }x\\ 0&\textrm{otherwise}\end{cases}.

Then $\vec{f}^{\prime}\lesssim\vec{f}$ and $\vec{f}^{\prime}$ satisfies $\lambda\vec{d}_{i}\circ x$. Let $w^{\prime}=x$.

If $y\prec x$, $y$ lies on the path between $v_{i}$ and $x$. Hence, it must happen that $v_{i}=y$ at the beginning of some "while loop" of Algorithm 1 when handling $\vec{d}_{i}$. Then that loop does not choose the subtree of $y$ containing $w$. Following the argument of Case 2, there is a flow $\vec{f}^{\prime}\lesssim\vec{f}$ which satisfies the demand vector $\lambda\vec{d}_{i}\circ y$. Let $w^{\prime}=y$.

Altogether, we always have a flow $\vec{f}^{\prime}\lesssim\vec{f}$ which satisfies the demand vector $\lambda\vec{d}_{i}\circ w^{\prime}$ . Because $dist(v_{i},w^{\prime})<dist(v_{i},w)=\delta$ , we apply the induction hypothesis and finish step 1.

Step 2: Let $\lambda^{*}=\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k})$ . Choose $w_{1},\cdots,w_{k}\in V$ such that there is a valid multi-commodity flow $\vec{f}_{1},\cdots,\vec{f}_{k}$ satisfying $\lambda^{*}\vec{d}_{1}\circ w_{1},\cdots,\lambda^{*}\vec{d}_{k}\circ w_{k}$ . For any $1\leq i\leq k$ , apply the claim in step 1 to $w_{i}$ and $\vec{f}_{i}$ , resulting in a flow $\vec{f}^{\prime}_{i}\lesssim\vec{f}_{i}$ which satisfies $\lambda^{*}\vec{d}_{i}\circ v_{i}$ . Therefore, we get a valid multi-commodity flow $\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k}$ satisfying $\lambda^{*}\vec{d}_{1}\circ v_{1},\cdots,\lambda^{*}\vec{d}_{k}\circ v_{k}$ . This means that the output of Algorithm 1 is an optimum solution to LoMuF. ∎
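Theorem 5 can be sanity-checked on small trees. The sketch below assumes the same hypothetical representation as before (`parent` maps a vertex to its parent, and `capacity[u]` is the capacity of the edge between `u` and `parent[u]`). It uses the fact that, on a tree, Lemma 3 determines the flow of each commodity on every edge once the target is fixed, so $\lambda$ is the smallest ratio of capacity to congestion. The brute-force search is exponential and serves only as a check.

```python
from itertools import product

def is_descendant(parent, s, u):
    """True if s equals u or lies below u."""
    while s is not None:
        if s == u:
            return True
        s = parent[s]
    return False

def concurrent_flow_value(parent, capacity, supplies, targets):
    """lambda for fixed targets on a tree (flows are unique on a tree)."""
    load = {u: 0.0 for u in parent if parent[u] is not None}
    for supply, target in zip(supplies, targets):
        total = sum(abs(x) for x in supply.values())
        above_target = set()            # vertices whose subtree contains the target
        v = target
        while v is not None:
            above_target.add(v)
            v = parent[v]
        for u in load:
            inside = sum(abs(x) for s, x in supply.items()
                         if is_descendant(parent, s, u))
            # The edge above u carries the supply on the side away from the target.
            load[u] += (total - inside) if u in above_target else inside
    ratios = [capacity[u] / load[u] for u in load if load[u] > 0]
    return min(ratios) if ratios else float("inf")

def brute_force_optimum(parent, capacity, supplies):
    """Try every tuple of targets; exponential in the number of commodities."""
    return max(concurrent_flow_value(parent, capacity, supplies, choice)
               for choice in product(list(parent), repeat=len(supplies)))

parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a"}
capacity = {"a": 2, "b": 1, "c": 2, "d": 1}
supplies = [{"c": -3, "b": -1}, {"c": -1, "d": -1}, {"c": -1, "b": -1}]
value = concurrent_flow_value(parent, capacity, supplies, ["c", "a", "r"])
best = brute_force_optimum(parent, capacity, supplies)
```

Here `["c", "a", "r"]` are the targets that Algorithm 1 selects for these supply vectors, and the value they achieve coincides with the brute-force optimum, as the theorem predicts.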

3.3 Approximation Algorithm on General Graphs

Theorem 5 suggests that LoMuF is not extremely intractable, at least in a special case. Fortunately, the tractability can be extended to more general graphs, in the sense of approximation. Let’s begin with a lemma, which shows the important role of master sources (defined below) in approximating LoMuF.

Arbitrarily fix a supply vector $\vec{d}$ on a capacitated graph $G=(V,E,\vec{c})$. Arbitrarily choose $\theta\geq|S|>1$, where $S$ is the set of sources of $\vec{d}$. Let $w$ be a master source of $\vec{d}$, namely $w=\operatorname*{arg\,max}_{v\in V}|\vec{d}(v)|$.

Lemma 6.

For any $u\in V$ and flow $\vec{f}$ satisfying $\vec{d}\circ u$ , there is a flow $\vec{f}^{\prime}\lesssim\vec{f}$ which satisfies $\frac{1}{\theta-1}\vec{d}\circ w$ .

Proof.

We proceed case by case.

Case 1: $u\notin S$ . For any $s\in S$ , define demand vector $\vec{d}_{s}$ such that for any $v\in V$ ,

\vec{d}_{s}(v)=\begin{cases}\vec{d}(s)&\textrm{if }v=s\\ -\vec{d}(s)&\textrm{if }v=u\\ 0&\textrm{otherwise}\end{cases}.

By Lemma 2, $\vec{f}$ has a decomposition $\{\vec{f}_{s}:s\in S\}$ satisfying $\{\vec{d}_{s}:s\in S\}$ .

Now for any $s\in S\setminus\{w\}$ , define flow $\vec{f}^{\prime}_{s}$ such that for any $e\in E$ ,

\vec{f}^{\prime}_{s}(e)=\frac{1}{\theta-1}\left(\vec{f}_{s}(e)-\vec{f}_{w}(e)\frac{\vec{d}(s)}{\vec{d}(w)}\right),

and demand vector $\vec{d}^{\prime}_{s}$ such that for any $v\in V$ ,

\vec{d}^{\prime}_{s}(v)=\begin{cases}\vec{d}(s)&\textrm{if }v=s\\ -\vec{d}(s)&\textrm{if }v=w\\ 0&\textrm{otherwise}\end{cases}.

Our task is reduced to establishing three claims.

Claim 1: for any $s\in S\setminus\{w\}$ , $\vec{f}^{\prime}_{s}$ satisfies $\frac{1}{\theta-1}\vec{d}^{\prime}_{s}$ .

It suffices to show $\phi(v,\vec{f}^{\prime}_{s})=\frac{1}{\theta-1}\vec{d}^{\prime}_{s}(v)$ for any $v\in V$ , where $\phi(x,\vec{g})=\sum_{e\in E_{-}(x)}\vec{g}(e)-\sum_{e\in E_{+}(x)}\vec{g}(e)$ is the net incoming flow of $\vec{g}$ at vertex $x$ . Clearly, $\phi(x,\vec{g})$ is linear in $\vec{g}$ .

Arbitrarily fix $v\in V$ . By definition of $\vec{f}^{\prime}_{s}$ ,

\begin{array}[]{rcl}\phi(v,\vec{f}^{\prime}_{s})&=&\frac{1}{\theta-1}\phi\left(v,\vec{f}_{s}-\vec{f}_{w}\frac{\vec{d}(s)}{\vec{d}(w)}\right)\\ &=&\frac{1}{\theta-1}\left(\phi(v,\vec{f}_{s})-\frac{\vec{d}(s)}{\vec{d}(w)}\phi(v,\vec{f}_{w})\right)\\ &=&\frac{1}{\theta-1}\left(\vec{d}_{s}(v)-\frac{\vec{d}(s)}{\vec{d}(w)}\vec{d}_{w}(v)\right)\qquad(\textrm{since }\vec{f}_{s},\vec{f}_{w}\textrm{ satisfy }\vec{d}_{s},\vec{d}_{w})\\ &=&\frac{1}{\theta-1}\vec{d}^{\prime}_{s}(v)\qquad(\textrm{by definition of }\vec{d}_{s},\vec{d}_{w},\vec{d}^{\prime}_{s})\end{array}

Claim 2: $\vec{f}^{\prime}=\sum_{s\in S\setminus\{w\}}\vec{f}^{\prime}_{s}$ satisfies $\frac{1}{\theta-1}\vec{d}\circ w$ . It immediately follows from Claim 1.

Claim 3: $\vec{f}^{\prime}\lesssim\vec{f}$ .

It holds because for any $e\in E$ ,

\begin{array}[]{rcl}|\vec{f}^{\prime}(e)|&=&|\sum_{s\in S\setminus\{w\}}\vec{f}^{\prime}_{s}(e)|\\ &\leq&\sum_{s\in S\setminus\{w\}}|\vec{f}^{\prime}_{s}(e)|\\ &=&\sum_{s\in S\setminus\{w\}}\frac{1}{\theta-1}\left|\vec{f}_{s}(e)-\vec{f}_{w}(e)\frac{\vec{d}(s)}{\vec{d}(w)}\right|\\ &\leq&\sum_{s\in S\setminus\{w\}}\frac{|\vec{f}_{s}(e)|}{\theta-1}+\frac{|\vec{f}_{w}(e)|}{\theta-1}\sum_{s\in S\setminus\{w\}}\frac{\vec{d}(s)}{\vec{d}(w)}\\ &\leq&\sum_{s\in S\setminus\{w\}}|\vec{f}_{s}(e)|+|\vec{f}_{w}(e)|\\ &=&|\vec{f}(e)|\end{array}

The proof of Case 1 finishes.

Case 2: $u=w\in S$ . The lemma trivially holds.

Case 3: $u\in S\setminus\{w\}$ .

The proof of Case 1 almost works, except that $\vec{d}_{u}$ is not well-defined and the decomposition of $\vec{f}$ does not include $\vec{f}_{u}$ . As a result, we still apply the proof of Case 1, after defining $\vec{f}_{u}\in\mathbb{R}^{E}$ and $\vec{d}_{u}\in\mathbb{R}^{V}$ to be all-zero vectors. ∎
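The construction in Case 1 can be checked numerically on a small instance. The sketch below uses a hypothetical star graph (centre $u$ and leaves $a,b,w$, each edge oriented from the leaf to the centre) and assumes that $\vec{d}\circ w$ keeps every source entry and places the negated total of the other entries at $w$, as in Claim 2. All names are illustrative and exact rational arithmetic is used.

```python
from fractions import Fraction as F

# Hypothetical star instance: leaves a, b, w send flow to the centre u.
# Each edge is oriented from the leaf (tail) to the centre (head).
edges = {"au": ("a", "u"), "bu": ("b", "u"), "wu": ("w", "u")}
vertices = ["a", "b", "w", "u"]
d = {"a": F(-1), "b": F(-2), "w": F(-3), "u": F(0)}   # supply vector
sources = [v for v in vertices if d[v] < 0]
theta = len(sources)                                   # theta = |S| = 3
w = max(vertices, key=lambda v: abs(d[v]))             # master source

# Decomposition {f_s} of a flow f satisfying d∘u: source s ships |d(s)| to u.
f_parts = {s: {e: (abs(d[s]) if edges[e][0] == s else F(0)) for e in edges}
           for s in sources}
f = {e: sum(f_parts[s][e] for s in sources) for e in edges}

def net_inflow(x, g):
    """phi(x, g): flow on incoming edges minus flow on outgoing edges."""
    return (sum(g[e] for e, (_, head) in edges.items() if head == x)
            - sum(g[e] for e, (tail, _) in edges.items() if tail == x))

# f'_s(e) = (f_s(e) - f_w(e) * d(s)/d(w)) / (theta - 1), summed over s != w.
f_new = {e: sum((f_parts[s][e] - f_parts[w][e] * d[s] / d[w]) / (theta - 1)
                for s in sources if s != w)
         for e in edges}

# Assumed form of (1/(theta-1)) d∘w: scaled supplies stay, w absorbs the rest.
target = {v: d[v] / (theta - 1) for v in vertices}
target[w] = -sum(d[s] for s in sources if s != w) / (theta - 1)

claim_2 = all(net_inflow(v, f_new) == target[v] for v in vertices)
claim_3 = all(abs(f_new[e]) <= abs(f[e]) for e in edges)
```

On this instance the rerouted flow sends $\frac{1}{2}$ and $1$ units from $a$ and $b$ through $u$ to $w$, and uses every edge no more heavily than $\vec{f}$ does.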

Remark 1.

Lemma 6 remains true if $\theta$ is replaced by $\eta\geq\frac{\sum_{v\in V}\vec{d}(v)}{\vec{d}(w)}$ .

Algorithm 2 is a simple algorithm for LoMuF with guaranteed approximation ratio.

Input: a capacitated graph $G=(V,E,\vec{c})$ , supply vectors $\vec{d}_{1},\cdots,\vec{d}_{k}$
Output: $w_{i}\in V,1\leq i\leq k$

1: for each $1\leq i\leq k$
2:   Output $w_{i}=\operatorname*{\mathop{\arg\max}}_{v\in V}|\vec{d}_{i}(v)|$ as the target of $\vec{d}_{i}$

Algorithm 2 An approximation algorithm for LoMuF.
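A minimal Python sketch of Algorithm 2 follows. It assumes that each supply vector is given as a dictionary from vertices to non-positive numbers; the function name and the sample data are hypothetical. Ties are broken by dictionary order, which the pseudocode leaves unspecified.

```python
def locate_targets(supply_vectors):
    """Algorithm 2: for each supply vector, pick a vertex of largest |d_i(v)|."""
    return [max(d, key=lambda v: abs(d[v])) for d in supply_vectors]

# Hypothetical input: two supply vectors over the vertices "a", "b", "c".
supplies = [
    {"a": -1, "b": -4, "c": 0},
    {"a": -2, "b": 0, "c": -1},
]
targets = locate_targets(supplies)
```

As in the pseudocode, the sketch never inspects the graph: the chosen targets depend on the supply vectors only.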

Theorem 7.

Algorithm 2 is $\max\{\theta-1,1\}$ -approximate, where $\theta=\max_{1\leq i\leq k}|\{v\in V:\vec{d}_{i}(v)<0\}|$ .

Proof.

Arbitrarily fix a capacitated graph $G=(V,E,\vec{c})$ and supply vectors $\vec{d}_{1},\cdots,\vec{d}_{k}\in\mathbb{R}_{-}^{V}$ as input to Algorithm 2. Let $w_{1},\cdots,w_{k}$ be the output. If $\theta=1$ , each $w_{i}$ is the unique source of $\vec{d}_{i}$ , which is trivially optimum. Hence, we assume $\theta>1$ and establish the approximation ratio $\theta-1$ .

Let $\lambda^{*}=\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k})$ . Suppose $u_{1},\cdots,u_{k}\in V$ is an optimum solution to LoMuF. This means that there is a multi-commodity flow $\{\vec{f}_{1},\cdots,\vec{f}_{k}\}$ satisfying $\{\lambda^{*}\vec{d}_{1}\circ u_{1},\cdots,\lambda^{*}\vec{d}_{k}\circ u_{k}\}$ .

For any $1\leq i\leq k$ , apply Lemma 6 with $w=w_{i},u=u_{i},\vec{f}=\vec{f}_{i},\vec{d}=\lambda^{*}\vec{d}_{i}$ , getting a flow $\vec{f}^{\prime}_{i}\lesssim\vec{f}_{i}$ which satisfies $\frac{\lambda^{*}}{\theta-1}\vec{d}_{i}\circ w_{i}$ . Since $\vec{f}^{\prime}_{i}\lesssim\vec{f}_{i}$ for every $i$ , the multi-commodity flow $\{\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k}\}$ is valid, and it satisfies $\{\frac{\lambda^{*}}{\theta-1}\vec{d}_{1}\circ w_{1},\cdots,\frac{\lambda^{*}}{\theta-1}\vec{d}_{k}\circ w_{k}\}$ , so $\lambda(G;\vec{d}_{1}\circ w_{1},\cdots,\vec{d}_{k}\circ w_{k})\geq\frac{\lambda^{*}}{\theta-1}$ . The proof ends. ∎

Remark 2.

By applying Remark 1 rather than Lemma 6, Theorem 7 remains true if $\theta$ is replaced by $\eta=\max_{1\leq i\leq k}\frac{\sum_{v\in V}\vec{d}_{i}(v)}{\vec{d}_{i}(w_{i})}$ , which is not bigger than $\theta$ . Hereunder, this $\eta$ will be called the concentration of the supply vectors. Intuitively, it indicates how much the demands are concentrated on the master sources.

Note that in Remark 2, $\eta\leq 2$ if the $w_{i}$ -entry dominates $\vec{d}_{i}$ for any $1\leq i\leq k$ , namely $|\vec{d}_{i}(w_{i})|\geq\sum_{v\neq w_{i}}|\vec{d}_{i}(v)|$ , so that $\max\{\eta-1,1\}=1$ . A special such case is when every supply vector has no more than 2 sources. Then by Remark 2, we immediately have the following corollary.

Corollary 8.

When every supply vector has a dominant entry, Algorithm 2 exactly solves LoMuF.
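The quantities $\theta$ and $\eta$ are easy to compute from the supply vectors alone. The sketch below evaluates them on hypothetical integer-valued supply vectors (represented as dictionaries) and derives the guarantee $\max\{\eta-1,1\}$ stated in Remark 2. The helper names are illustrative.

```python
from fractions import Fraction

def source_count(d):
    """Number of sources of d, i.e. vertices with negative entry."""
    return sum(1 for x in d.values() if x < 0)

def concentration(supply_vectors):
    """eta = max_i (sum_v d_i(v)) / d_i(w_i), with w_i a master source of d_i."""
    ratios = []
    for d in supply_vectors:
        master = max(d.values(), key=abs)          # the entry d_i(w_i)
        ratios.append(Fraction(sum(d.values()), master))
    return max(ratios)

# Hypothetical instance: in both vectors the largest entry dominates the rest.
supplies = [{"a": -1, "b": -4, "c": 0}, {"a": -2, "b": -1, "c": -1}]
theta = max(source_count(d) for d in supplies)
eta = concentration(supplies)
ratio_bound = max(eta - 1, 1)
```

Here $\theta=3$ while $\eta=2$, so the bound in terms of $\eta$ equals 1 although the bound $\theta-1$ of Theorem 7 equals 2.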

4 Hardness and Algorithms of Di-LoMuF

In this section, we adapt LoMuF to networks modeled as directed graphs. Such networks have also been studied in the network flow community and frequently appear in practice. For example, many data servers allow only downstream traffic.

We adopt the notation and concepts in Section 2 in case of no ambiguity, with three exceptions:

•

Every edge has an inherent direction and is called an arc. An arc from vertex $u$ to vertex $v$ is denoted by $(u,v)$ . We usually use $G=(V,A,\vec{c})$ to represent a capacitated directed graph with vertex set $V$ , arc set $A$ , and capacity vector $\vec{c}\in\mathbb{R}_{+}^{A}$ . Accordingly, $A_{-}(v)$ ( $A_{+}(v)$ , respectively) stands for the set of incoming (outgoing, respectively) arcs at vertex $v$ . Likewise, define $A_{-}(U)$ and $A_{+}(U)$ for vertex subset $U\subseteq V$ .
•

Any arc only allows a flow in the inherent direction, so we can naturally specify a network flow using a non-negative vector $\vec{f}\in\mathbb{R}_{+}^{A}$ .
•

We continue to study the problem of target location for maximizing concurrent multi-commodity flow, but in the context of directed graphs. The problem will be called Di-LoMuF to highlight the directed model.

Note that Lemmas 1-3 still hold in the context of the directed graph model.

The following theorem indicates the strong relation between LoMuF and Di-LoMuF.

Theorem 9.

LoMuF is reducible to Di-LoMuF.

Proof.

Arbitrarily fix a capacitated graph $G=(V,E,\vec{c})$ and supply vectors $\vec{d}_{1},\cdots,\vec{d}_{k}\in\mathbb{R}^{V}$ . We will construct a capacitated directed graph $G^{\prime}=(V^{\prime},A,\vec{c}^{\prime})$ and supply vectors $\vec{d}^{\prime}_{1},\cdots,\vec{d}^{\prime}_{k}\in\mathbb{R}^{V^{\prime}}$ , and prove that the construction preserves the quality of solutions.

Step 1: Construct $G^{\prime}$ and the supply vectors.

The directed graph $G^{\prime}$ is obtained by replacing any edge of $G$ with the diamond gadget as illustrated in Figure 3. Specifically, $V^{\prime}=V\bigcup\{s_{e},t_{e}:e\in E\}$ , $A=\{(u,s_{e}),(v,s_{e}),(s_{e},t_{e}),(t_{e},u),(t_{e},v):e=\langle u,v\rangle\in E\}$ , and for any arc $a$ in the diamond corresponding to edge $e$ , $\vec{c}^{\prime}(a)=\vec{c}(e)$ . For any $1\leq i\leq k$ , define $\vec{d}^{\prime}_{i}$ such that for any $v\in V^{\prime}$ ,

\vec{d}^{\prime}_{i}(v)=\begin{cases}\vec{d}_{i}(v)&\textrm{if }v\in V\\ 0&\textrm{otherwise}\end{cases}.
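Step 1 is mechanical and can be sketched in a few lines of Python. The sketch assumes that the undirected graph is given as a dictionary from vertex pairs to capacities and that supply vectors are dictionaries. The tuples used to name $s_{e}$ and $t_{e}$ are a hypothetical encoding.

```python
def diamond_reduction(vertices, edges, supplies):
    """edges: dict mapping (u, v) -> capacity of the undirected edge <u, v>."""
    new_vertices = list(vertices)
    arcs = {}
    for (u, v), cap in edges.items():
        s_e, t_e = ("s", u, v), ("t", u, v)    # hypothetical names for s_e, t_e
        new_vertices += [s_e, t_e]
        # The five arcs of the diamond gadget all inherit the capacity of e.
        for arc in [(u, s_e), (v, s_e), (s_e, t_e), (t_e, u), (t_e, v)]:
            arcs[arc] = cap
    # d'_i agrees with d_i on V and is zero on the gadget vertices.
    new_supplies = [{x: d.get(x, 0) for x in new_vertices} for d in supplies]
    return new_vertices, arcs, new_supplies

# Hypothetical path a - b - c with one supply vector.
V2, A2, D2 = diamond_reduction(
    ["a", "b", "c"],
    {("a", "b"): 2, ("b", "c"): 5},
    [{"a": -1, "b": 0, "c": -3}],
)
```

The construction adds two vertices and five arcs per edge, so its size is linear in the size of $G$.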

Step 2: Prove that for any $v_{1},\cdots,v_{k}\in V$ , $\lambda(G;\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{k}\circ v_{k})\leq\lambda(G^{\prime};\vec{d}^{\prime}_{1}\circ v_{1},\cdots,\vec{d}^{\prime}_{k}\circ v_{k})$ .

Consider any $\lambda$ and any valid multi-commodity flow $\{\vec{f}_{1},\cdots,\vec{f}_{k}\}$ satisfying $\{\lambda\vec{d}_{1}\circ v_{1},\cdots,\lambda\vec{d}_{k}\circ v_{k}\}$ . For any $1\leq i\leq k$ , define flow $\vec{f}^{\prime}_{i}$ as follows: for any $e=\langle u,v\rangle\in E$ , if $\vec{f}_{i}(e)$ is from $u$ to $v$ , set $\vec{f}^{\prime}_{i}(u,s_{e})=\vec{f}^{\prime}_{i}(s_{e},t_{e})=\vec{f}^{\prime}_{i}(t_{e},v)=|\vec{f}_{i}(e)|$ , otherwise set $\vec{f}^{\prime}_{i}(v,s_{e})=\vec{f}^{\prime}_{i}(s_{e},t_{e})=\vec{f}^{\prime}_{i}(t_{e},u)=|\vec{f}_{i}(e)|$ ; $\vec{f}^{\prime}_{i}(a)=0$ for any other arc $a$ . It is straightforward to check that the multi-commodity flow $\{\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k}\}$ is valid and satisfies $\{\lambda\vec{d}^{\prime}_{1}\circ v_{1},\cdots,\lambda\vec{d}^{\prime}_{k}\circ v_{k}\}$ .

Step 3: Prove that for any $v^{\prime}_{1},\cdots,v^{\prime}_{k}\in V^{\prime}$ , there are $v_{1},\cdots,v_{k}\in V$ such that $\lambda(G^{\prime};\vec{d}^{\prime}_{1}\circ v^{\prime}_{1},\cdots,\vec{d}^{\prime}_{k}\circ v^{\prime}_{k})\leq\lambda(G;\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{k}\circ v_{k})$ .

Consider any $\lambda$ and any valid multi-commodity flow $\{\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k}\}$ satisfying $\{\lambda\vec{d}^{\prime}_{1}\circ v^{\prime}_{1},\cdots,\lambda\vec{d}^{\prime}_{k}\circ v^{\prime}_{k}\}$ . For any $1\leq i\leq k$ , define flow $\vec{f}_{i}$ as follows. For any $e=\langle u,v\rangle\in E$ oriented from $u$ to $v$ , we deal case by case:

•

When $v^{\prime}_{i}\notin\{s_{e},t_{e}\}$ , set $\vec{f}_{i}(e)=\vec{f}^{\prime}_{i}(u,s_{e})-\vec{f}^{\prime}_{i}(v,s_{e})$ .
•

When $v^{\prime}_{i}\in\{s_{e},t_{e}\}$ , set $\vec{f}_{i}(e)=\vec{f}^{\prime}_{i}(u,s_{e})$ if $\vec{f}^{\prime}_{i}(u,s_{e})<\vec{f}^{\prime}_{i}(v,s_{e})$ , otherwise $\vec{f}_{i}(e)=-\vec{f}^{\prime}_{i}(v,s_{e})$ .

Now for any $1\leq i\leq k$ , we find a proper $v_{i}\in V$ . This is also done case by case:

•

When $v^{\prime}_{i}\in V$ , let $v_{i}=v^{\prime}_{i}$ .
•

When $v^{\prime}_{i}\in\{s_{e},t_{e}\}$ for $e=\langle u,v\rangle$ , let $v_{i}=u$ if $\vec{f}^{\prime}_{i}(u,s_{e})>\vec{f}^{\prime}_{i}(v,s_{e})$ , otherwise $v_{i}=v$ .

Again, it is easy to check that the multi-commodity flow $\{\vec{f}_{1},\cdots,\vec{f}_{k}\}$ is valid and satisfies $\{\lambda\vec{d}_{1}\circ v_{1},\cdots,\lambda\vec{d}_{k}\circ v_{k}\}$ . ∎

Remark 3.

Theorem 9 implies that Di-LoMuF is at least as hard as LoMuF. Together with Theorem 4, Di-LoMuF is also NP-hard. More importantly, the reduction in the above proof preserves approximation ratio: any $\alpha$ -approximation algorithm of Di-LoMuF, combined with the reduction, also $\alpha$ -approximately solves LoMuF.

We further show that Di-LoMuF has no PTAS.

Theorem 10.

Unless P=NP, Di-LoMuF cannot be efficiently approximated with a ratio smaller than 2.

Proof.

We establish a reduction from 3-DM to Di-LoMuF and show that the optimum value of Di-LoMuF exhibits a large gap, depending on whether or not a perfect matching exists.

Arbitrarily fix an instance $(X,Y,Z,W)$ of 3-DM. We construct a capacitated directed graph $G=(V,A,\vec{c})$ and $k=|X|$ supply vectors $\vec{d}_{1},\cdots,\vec{d}_{k}$ . Specifically, as illustrated in Figure 4, $G$ is adapted from the undirected graph in Figure 2, up to two modifications:

•

The red parts, namely, vertices $t^{\prime}_{X},t^{\prime}_{Y},t^{\prime}_{Z}$ and their incident edges, are removed.
•

All the arcs are directed upward, as indicated by the arrows.

All the arcs have capacity 1. Define supply vectors $\vec{d}_{1}=\cdots=\vec{d}_{k}$ such that for any $v\in V$ ,

\vec{d}_{1}(v)=\begin{cases}-1&\textrm{if }v\in\{t_{X},t_{Y},t_{Z}\}\\ 0&\textrm{otherwise}\end{cases}.

Our theorem immediately holds if we have the following two facts:

Fact 1: If $W$ contains a perfect matching, $\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k})\geq 1$ .

To prove this fact, suppose without loss of generality that $\{w_{1},\cdots,w_{k}\}$ is a perfect matching in $W$ . For any $1\leq i\leq k$ , assume $w_{i}=\{x,y,z\}$ with $x\in X,y\in Y,z\in Z$ , and define a flow $\vec{f}_{i}$ such that for any arc $a\in A$ ,

\vec{f}_{i}(a)=\begin{cases}1&\textrm{if }a\in\{(t_{X},x),(x,w_{i}),(t_{Y},y),(y,w_{i}),(t_{Z},z),(z,w_{i})\}\\ 0&\textrm{otherwise}\end{cases}.

It is straightforward to check that the multi-commodity flow $\{\vec{f}_{i}:1\leq i\leq k\}$ is valid and satisfies $\{\vec{d}_{i}\circ w_{i}:1\leq i\leq k\}$ . Hence, $\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k})\geq 1$ .

Fact 2: If $W$ contains no perfect matching, $\lambda^{*}=\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k})\leq\frac{1}{2}$ .

Let $v_{1},\cdots,v_{k}\in V$ be such that $\lambda(G;\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{k}\circ v_{k})=\lambda^{*}$ . One immediately sees that $v_{i}\in W$ for any $1\leq i\leq k$ , unless $\lambda^{*}=0$ . Without loss of generality, assume that $v_{i}=w_{i}$ for any $1\leq i\leq k$ .

Let $\{\vec{f}_{i}:1\leq i\leq k\}$ be a valid multi-commodity flow that satisfies $\{\lambda^{*}\vec{d}_{i}\circ w_{i}:1\leq i\leq k\}$ .

Since $\{w_{i}:1\leq i\leq k\}$ is not a perfect matching, there must be $v\in X\bigcup Y\bigcup Z$ such that $|\{i:1\leq i\leq k,v\in w_{i}\}|\geq 2$ . Again without loss of generality, assume that $v=x_{1}\in X$ and $v\in w_{1}\bigcap w_{2}$ . For any $i\in\{1,2\}$ , one can observe that $\vec{f}_{i}(t_{X},x_{j})=0$ for any $2\leq j\leq k$ , because a flow on such an arc can not reach $w_{1}$ or $w_{2}$ .

Then by Lemma 3, $\vec{f}_{1}(a)=\vec{f}_{2}(a)=\lambda^{*}$ where $a=(t_{X},x_{1})$ . Considering that $1=\vec{c}(a)\geq\vec{f}_{1}(a)+\vec{f}_{2}(a)$ , we have $\lambda^{*}\leq\frac{1}{2}$ . ∎

To investigate the borderline of the intractability of Di-LoMuF, one might impose restrictions on instances to make them simple. One dimension of simplification is to upper bound the source number of the supply vectors. When every supply vector has only one source, the sources altogether form a trivial optimum solution to Di-LoMuF. Hence it is reasonable to focus on bi-source supply vectors, namely those each having at most two sources. Another dimension of simplification is to focus on simple graphs, so directed trees (called di-trees) are natural candidates. A di-tree is a directed graph which, after removing the directions of the arcs and neglecting multi-edges, becomes an undirected tree. A di-path can be defined likewise. To make our result as strong as possible, we further require that the di-trees are symmetric. A capacitated directed graph is called symmetric, if (1) all arcs have equal capacity, and (2) once having an arc $(u,v)$ , it also has the twin arc $(v,u)$ . We will show that Di-LoMuF remains hard even on these nearly trivial instances.

Before continuing, recall the 3-partition problem, which is well-known to be strongly NP-hard [8, page 99]. An instance of the 3-partition problem is a multi-set $S$ of positive integers with $|S|=3m$ for some integer $m$ . The objective is to decide whether $S$ has an equi-partition, namely a partition $S_{1},\cdots,S_{m}$ of $S$ such that $\sum_{s\in S_{i}}s=\sum_{s\in S_{j}}s$ for any $1\leq i,j\leq m$ .

Theorem 11.

Di-LoMuF is NP-hard on symmetric di-paths with bi-source supply vectors.

Proof.

We prove the theorem via a reduction from the 3-partition problem to Di-LoMuF. To this end, given an instance $S=\{s_{1},\cdots,s_{3m}\}$ of the 3-partition problem, we construct a symmetric di-path and $(5m-2)$ bi-source supply vectors.

Specifically, as illustrated in Figure 5, the di-path $G=(V,A,\vec{c})$ consists of $m$ vertices $v_{1},\cdots,v_{m}$ and arcs $a_{i}=(v_{i},v_{i+1})$ and $a^{\prime}_{i+1}=(v_{i+1},v_{i})$ for any $1\leq i<m$ . Each arc has capacity $mB$ , where $B=\frac{\sum_{s\in S}s}{m}$ . For any $1\leq i\leq 3m$ and $1\leq j<m$ , define supply vectors $\vec{d}_{i},\vec{d}^{\prime}_{j},\vec{d}^{\prime\prime}_{j}$ such that for any $v\in V$ ,

\vec{d}_{i}(v)=\begin{cases}-s_{i}&\textrm{if }v\in\{v_{1},v_{m}\}\\ 0&\textrm{otherwise}\end{cases}

\vec{d}^{\prime}_{j}(v)=\begin{cases}-(mB+1)&\textrm{if }v=v_{j}\\ -(m-j)B&\textrm{if }v=v_{j+1}\\ 0&\textrm{otherwise}\end{cases},

\vec{d}^{\prime\prime}_{j}(v)=\begin{cases}-jB&\textrm{if }v=v_{j}\\ -(mB+1)&\textrm{if }v=v_{j+1}\\ 0&\textrm{otherwise}\end{cases}.

For notational simplicity, we sometimes use $\vec{d}_{3m+1},\cdots,\vec{d}_{5m-2}$ to stand for $\vec{d}^{\prime}_{1},\cdots,\vec{d}^{\prime}_{m-1},\vec{d}^{\prime\prime}_{1},\cdots,\vec{d}^{\prime\prime}_{m-1}$ , respectively.
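The instance above can be generated programmatically, which makes the counts easy to inspect. The following Python sketch represents vertex $v_{i}$ by the integer $i$ and each supply vector by a dictionary. The function name and the sample input are hypothetical.

```python
from fractions import Fraction

def build_instance(S):
    """S: list of 3m positive integers (a 3-partition instance)."""
    m = len(S) // 3
    B = Fraction(sum(S), m)
    V = list(range(1, m + 1))                 # vertex v_i is represented by i
    capacity = {}
    for i in range(1, m):
        capacity[(i, i + 1)] = m * B          # arc a_i
        capacity[(i + 1, i)] = m * B          # arc a'_{i+1}

    def vec(entries):
        return {v: entries.get(v, 0) for v in V}

    d = [vec({1: -s, m: -s}) for s in S]                                     # d_i
    d1 = [vec({j: -(m * B + 1), j + 1: -(m - j) * B}) for j in range(1, m)]  # d'_j
    d2 = [vec({j: -j * B, j + 1: -(m * B + 1)}) for j in range(1, m)]        # d''_j
    return V, capacity, d + d1 + d2

# Hypothetical instance with m = 3 and B = 6.
V, capacity, vectors = build_instance([1, 2, 3, 1, 2, 3, 1, 2, 3])
```

For this input the sketch yields $5m-2=13$ supply vectors, each with at most two sources, and four arcs of capacity $mB=18$.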

Our proof will be done in two steps.

Step 1. If $S$ has an equi-partition, then $\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{5m-2})\geq 1$ .

Let $S_{1},\cdots,S_{m}$ be an equi-partition of $S$ . For any $1\leq i\leq 3m$ , let $1\leq j\leq m$ satisfy $s_{i}\in S_{j}$ , and we define flow $\vec{f}_{i}$ such that for any $a\in A$ ,

\vec{f}_{i}(a)=\begin{cases}s_{i}&\textrm{if }a\in\{a_{k}:1\leq k<j\}\bigcup\{a^{\prime}_{k}:j<k\leq m\}\\ 0&\textrm{otherwise}\end{cases}.

One can check that $\vec{f}_{i}$ satisfies demand vector $\vec{d}_{i}\circ v_{j}$ .

For any $1\leq i\leq m-1$ , define flows $\vec{f}^{\prime}_{i},\vec{f}^{\prime\prime}_{i}$ such that for any $a\in A$ ,

\vec{f}^{\prime}_{i}(a)=\begin{cases}(m-i)B&\textrm{if }a=a^{\prime}_{i+1}\\ 0&\textrm{otherwise}\end{cases},

\vec{f}^{\prime\prime}_{i}(a)=\begin{cases}iB&\textrm{if }a=a_{i}\\ 0&\textrm{otherwise}\end{cases}.

Obviously, $\vec{f}^{\prime}_{i}$ satisfies $\vec{d}^{\prime}_{i}\circ v_{i}$ , and $\vec{f}^{\prime\prime}_{i}$ satisfies $\vec{d}^{\prime\prime}_{i}\circ v_{i+1}$ .

It is straightforward to check that all these flows form a valid multi-commodity flow. Altogether, we have $\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{5m-2})\geq 1$ .
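The validity check in Step 1 amounts to summing the loads on every arc. The sketch below does this for a hypothetical instance, assuming that $B$ is an integer and encoding the equi-partition as a list `parts` with `parts[i] = j` when $s_{i}\in S_{j}$.

```python
def arc_loads(S, parts):
    """Total load that the flows of Step 1 put on each arc of the di-path."""
    m = len(S) // 3
    B = sum(S) // m                           # assumes m divides the total
    right = {k: 0 for k in range(1, m)}       # load on a_k  = (v_k, v_{k+1})
    left = {k: 0 for k in range(2, m + 1)}    # load on a'_k = (v_k, v_{k-1})
    for s, j in zip(S, parts):
        for k in range(1, j):                 # supply at v_1 travels right to v_j
            right[k] += s
        for k in range(j + 1, m + 1):         # supply at v_m travels left to v_j
            left[k] += s
    for i in range(1, m):
        left[i + 1] += (m - i) * B            # f'_i uses the arc a'_{i+1}
        right[i] += i * B                     # f''_i uses the arc a_i
    return right, left, m * B

# Hypothetical instance: m = 3, B = 6, and the partition {1,2,3} three times.
right, left, cap = arc_loads([1, 2, 3, 1, 2, 3, 1, 2, 3],
                             [1, 1, 1, 2, 2, 2, 3, 3, 3])
```

Every arc turns out to be loaded to exactly its capacity $mB$, which is the tightness that Step 2 exploits in the converse direction.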

Step 2. If $\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{5m-2})\geq 1$ , $S$ has an equi-partition.

Suppose $u_{i},u^{\prime}_{j},u^{\prime\prime}_{j}\in V,1\leq i\leq 3m,1\leq j\leq m-1,$ are such that there is a valid multi-commodity flow $\vec{f}_{i},\vec{f}^{\prime}_{j},\vec{f}^{\prime\prime}_{j},1\leq i\leq 3m,1\leq j\leq m-1,$ satisfying demand vectors $\vec{d}_{i}\circ u_{i},\vec{d}^{\prime}_{j}\circ u^{\prime}_{j},\vec{d}^{\prime\prime}_{j}\circ u^{\prime\prime}_{j},1\leq i\leq 3m,1\leq j\leq m-1$ .

For any $1\leq i\leq m$ , let $V_{i}=\{v_{1},\cdots,v_{i}\}$ . We proceed in two substeps.

Step 2.1. For any $1\leq i<m$ , $u^{\prime}_{i}=v_{i}$ and $u^{\prime\prime}_{i}=v_{i+1}$ .

Arbitrarily fix $1\leq i<m$ . For contradiction, assume that $u^{\prime}_{i}\notin V_{i}$ . Applying Lemma 3 to $\vec{f}^{\prime}_{i},\vec{d}^{\prime}_{i}\circ u^{\prime}_{i},V_{i}$ , one gets $\vec{f}^{\prime}_{i}(a_{i})\geq mB+1$ , contradicting the fact that $\vec{c}(a_{i})=mB$ . Hence, $u^{\prime}_{i}\in V_{i}$ . Likewise, one can further show that $u^{\prime}_{i}\notin V_{i-1}$ . As a result, $u^{\prime}_{i}=v_{i}$ .

In a similar way, we also have $u^{\prime\prime}_{i}=v_{i+1}$ .

Step 2.2. $S$ has an equi-partition.

Arbitrarily fix $1\leq i<m$ . Applying Lemma 3 to $\vec{f}^{\prime}_{i},\vec{d}^{\prime}_{i}\circ u^{\prime}_{i},V_{i}$ and to $\vec{f}^{\prime\prime}_{i},\vec{d}^{\prime\prime}_{i}\circ u^{\prime\prime}_{i},V_{i}$ respectively, one gets

\displaystyle\vec{f}^{\prime}_{i}(a^{\prime}_{i+1})\geq(m-i)B,\vec{f}^{\prime\prime}_{i}(a_{i})\geq iB.

(2)

Let $J_{i}=\{j:1\leq j\leq 3m,u_{j}\in V_{i}\}$ . For any $j\in J_{i}$ , apply Lemma 3 to $\vec{f}_{j},\vec{d}_{j}\circ u_{j},V_{i}$ , and we have

\displaystyle\vec{f}_{j}(a^{\prime}_{i+1})\geq s_{j}.

(3)

Likewise, for any $j\notin J_{i}$ , applying Lemma 3 to $\vec{f}_{j},\vec{d}_{j}\circ u_{j},V_{i}$ results in

\displaystyle\vec{f}_{j}(a_{i})\geq s_{j}.

(4)

Then,

\begin{split}\begin{array}[]{rll}2mB&=\sum_{1\leq j\leq 3m}s_{j}+iB+(m-i)B&\\ &=\sum_{j\in J_{i}}s_{j}+\sum_{j\notin J_{i}}s_{j}+iB+(m-i)B&\\ &\leq\sum_{j\in J_{i}}\vec{f}_{j}(a^{\prime}_{i+1})+\sum_{j\notin J_{i}}\vec{f}_{j}(a_{i})&\\ &\quad+\vec{f}^{\prime\prime}_{i}(a_{i})+\vec{f}^{\prime}_{i}(a^{\prime}_{i+1})&\qquad\textrm{ by (2)-(4)}\\ &\leq\vec{c}(a_{i})+\vec{c}(a^{\prime}_{i+1})=2mB&\qquad\textrm{ by capacity constraints}\end{array}\end{split}

(5)

As a result, all the inequalities in (2)-(5) are actually equalities. Hence,

\sum_{j\in J_{i}}s_{j}=\sum_{j\in J_{i}}\vec{f}_{j}(a^{\prime}_{i+1})=mB-\vec{f}^{\prime}_{i}(a^{\prime}_{i+1})=iB.

Let $J_{0}=\emptyset$ and $J_{m}=\{j:1\leq j\leq 3m\}$ . For any $1\leq i\leq m$ , define $S_{i}=\{s_{j}:j\in J_{i}\setminus J_{i-1}\}$ , which satisfies $\sum_{s\in S_{i}}s=\sum_{j\in J_{i}\setminus J_{i-1}}s_{j}=\sum_{j\in J_{i}}s_{j}-\sum_{j\in J_{i-1}}s_{j}=B$ . This means that $S_{1},\cdots,S_{m}$ is an equi-partition of $S$ .∎

Remark 4.

Recall Corollary 8, which implies the tractability of LoMuF on bi-source supply vectors. It is in sharp contrast to the intractability of Di-LoMuF in this situation. Furthermore, Theorem 5 claims that LoMuF is polynomial-time solvable when the input graph is a tree, but Di-LoMuF remains NP-hard even on symmetric di-paths. These serve as evidence that Di-LoMuF is generally harder than LoMuF.

We have seen the hardness of Di-LoMuF even in the nearly-trivial cases. Fortunately, the next theorem will relieve us from frustration, because it indicates the possibility to approximately solve Di-LoMuF. A new definition is needed.

Given a capacitated directed graph $G=(V,A,\vec{c})$ , for any $u,v\in V$ , let $A_{\{u,v\}}=\{(u,v),(v,u)\}\bigcap A$ . Define the induced graph of $G$ to be the capacitated undirected graph $G^{\prime}=(V,E,\vec{c}^{\prime})$ , where $E=\{\langle u,v\rangle:u,v\in V,A_{\{u,v\}}\neq\emptyset\}$ , and for any $e=\langle u,v\rangle\in E$ , $\vec{c}^{\prime}(e)=\sum_{a\in A_{\{u,v\}}}\vec{c}(a)$ . Intuitively, $G^{\prime}$ is obtained from $G$ by neglecting the direction of the arcs and merging the capacities of twin arcs if any.
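The induced graph can be computed in a single pass over the arcs. A minimal sketch, assuming that arcs are given as a dictionary from ordered pairs to capacities, that there are no self-loops, and that an undirected edge is represented by a `frozenset` (all of these are illustrative choices):

```python
def induced_graph(arcs):
    """arcs: dict (u, v) -> capacity. Returns dict frozenset({u, v}) -> capacity."""
    edges = {}
    for (u, v), cap in arcs.items():
        key = frozenset((u, v))               # forget the direction of the arc
        edges[key] = edges.get(key, 0) + cap  # merge capacities of twin arcs
    return edges

# Hypothetical directed graph: twin arcs between a and b, a single arc b -> c.
induced = induced_graph({("a", "b"): 3, ("b", "a"): 3, ("b", "c"): 2})
```

On a symmetric graph every edge of the induced graph receives twice the common arc capacity.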

Theorem 12.

Di-LoMuF has a polynomial-time 2-approximation algorithm on symmetric di-trees.

Proof.

Arbitrarily fix a symmetric di-tree $G=(V,A,\vec{c})$ and supply vectors $\vec{d}_{1},\cdots,\vec{d}_{k}\in\mathbb{R}_{-}^{V}$ . Let $G^{\prime}=(V,E,\vec{c}^{\prime})$ be the induced graph of $G$ and arbitrarily orient the edges. Let $v_{1},\cdots,v_{k}$ be the output of Algorithm 1 when the input is $(G^{\prime};\vec{d}_{1},\cdots,\vec{d}_{k})$ . We set about to prove that $v_{1},\cdots,v_{k}$ is a 2-approximate solution to Di-LoMuF on the instance $(G;\vec{d}_{1},\cdots,\vec{d}_{k})$ .

Let $\lambda^{*}=\lambda(G;\vec{d}_{1},\cdots,\vec{d}_{k})$ and $\lambda^{\prime*}=\lambda(G^{\prime};\vec{d}_{1},\cdots,\vec{d}_{k})$ . Our task is reduced to proving two claims.

Claim 1. $\lambda(G;\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{k}\circ v_{k})\geq\frac{\lambda^{\prime*}}{2}$ .

Let $\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k}\in\mathbb{R}^{E}$ be a valid multi-commodity flow satisfying $\lambda^{\prime*}\vec{d}_{1}\circ v_{1},\cdots,\lambda^{\prime*}\vec{d}_{k}\circ v_{k}$ , where $\lambda^{\prime*}=\lambda(G^{\prime};\vec{d}_{1},\cdots,\vec{d}_{k})$ .

For any $1\leq i\leq k$ , define flow $\vec{f}_{i}\in\mathbb{R}_{+}^{A}$ such that for any arc $(u,v)\in A$ ,

\vec{f}_{i}(u,v)=\begin{cases}\frac{|\vec{f}^{\prime}_{i}(e)|}{2}&\begin{array}[]{l}\textrm{if either the orientation of }e=\langle u,v\rangle\textrm{ is from }u\textrm{ to }v\textrm{ and }\vec{f}^{\prime}_{i}(e)>0\\ \textrm{or the orientation is from }v\textrm{ to }u\textrm{ and }\vec{f}^{\prime}_{i}(e)<0\end{array}\\ 0&\textrm{otherwise}\end{cases}.

It is straightforward to check that $\vec{f}_{1},\cdots,\vec{f}_{k}$ is a valid multi-commodity flow on $G$ that satisfies $\frac{\lambda^{\prime*}}{2}\vec{d}_{1}\circ v_{1},\cdots,\frac{\lambda^{\prime*}}{2}\vec{d}_{k}\circ v_{k}$ . Hence, Claim 1 holds.

Claim 2. $\lambda^{\prime*}\geq\lambda^{*}$ .

Let $u_{1},\cdots,u_{k}\in V$ be such that there is a valid multi-commodity flow $\vec{f}_{1},\cdots,\vec{f}_{k}\in\mathbb{R}_{+}^{A}$ on $G$ which satisfies $\lambda^{*}\vec{d}_{1}\circ u_{1},\cdots,\lambda^{*}\vec{d}_{k}\circ u_{k}$ . For any $1\leq i\leq k$ , define flow $\vec{f}^{\prime}_{i}\in\mathbb{R}^{E}$ on $G^{\prime}$ as follows: for any edge $e=\langle u,v\rangle\in E$ , if it is oriented from $u$ to $v$ in $G^{\prime}$ , set $\vec{f}^{\prime}_{i}(e)=\vec{f}_{i}(u,v)-\vec{f}_{i}(v,u)$ . Roughly speaking, each $\vec{f}^{\prime}_{i}$ is obtained from $\vec{f}_{i}$ by merging the traffic on twin arcs.

Again, it is easy to check that $\vec{f}^{\prime}_{1},\cdots,\vec{f}^{\prime}_{k}$ is a valid multi-commodity flow on $G^{\prime}$ that satisfies $\lambda^{*}\vec{d}_{1}\circ u_{1},\cdots,\lambda^{*}\vec{d}_{k}\circ u_{k}$ . This immediately leads to Claim 2.

Combining Claims 1 and 2, we have $\lambda(G;\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{k}\circ v_{k})\geq\frac{\lambda^{*}}{2}$ , which means that $v_{1},\cdots,v_{k}$ is a 2-approximate solution to Di-LoMuF on the instance $(G;\vec{d}_{1},\cdots,\vec{d}_{k})$ . ∎
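The conversion used in Claim 1 can be illustrated on a tiny example. The sketch below assumes that a flow on $G^{\prime}$ is a dictionary from edge names to signed values (positive along the chosen orientation) and that validity means the absolute values of all commodities on an edge sum to at most its capacity. The instance, a symmetric di-path with unit arc capacities, is hypothetical.

```python
from fractions import Fraction

def split_onto_arcs(orientation, undirected_flow):
    """orientation: edge name -> (tail, head); undirected_flow: edge -> signed value.
    Returns the directed flow of Claim 1: half the magnitude on the arc that
    points in the direction of travel, zero on its twin."""
    directed = {}
    for e, (u, v) in orientation.items():
        value = Fraction(undirected_flow[e])
        directed[(u, v)] = value / 2 if value > 0 else Fraction(0)
        directed[(v, u)] = -value / 2 if value < 0 else Fraction(0)
    return directed

# Symmetric di-path a - b - c, arc capacity 1, hence induced edge capacity 2.
orientation = {"e1": ("a", "b"), "e2": ("b", "c")}
flows_on_induced = [{"e1": 1, "e2": 1}, {"e1": 1, "e2": -1}]
directed = [split_onto_arcs(orientation, g) for g in flows_on_induced]
loads = {arc: sum(f[arc] for f in directed) for arc in directed[0]}
```

Both edges of the induced path are saturated by the two commodities, yet after halving no arc carries more than its capacity 1. This is why the factor 2 is lost in Claim 1.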

Theorem 12 can be extended to general symmetric directed graphs. Recall the concept concentration defined in Remark 2.

Corollary 13.

Di-LoMuF has a polynomial-time $2\cdot\max\{\eta-1,1\}$ -approximation algorithm on symmetric directed graphs, where $\eta$ is the concentration of the supply vectors.

Proof.

We follow the proof of Theorem 12. The only difference is that Algorithm 2 rather than Algorithm 1 is invoked. This modification is necessary, since Algorithm 1 is unfit for general undirected graphs.

The detailed proof is omitted. ∎

5 Other Variants of LoMuF

We continue to handle other variants of LoMuF, which are defined by extending LoMuF in three dimensions:

1.

Different network models. In the previous sections, we have thoroughly studied undirected and directed graphs. This section will consider the unsplittable flow model, which means that any flow from a source to a target is along one path. Such a flow model has been actively studied in the literature [17].
2.

Different solution constraints. We restrict the targets to be chosen from a candidate set, rather than from the entire vertex set. This properly models the practical situation where applications can be deployed to prescribed servers. Such a restricted version of LoMuF is called restricted-LoMuF.
3.

Different optimization goals. The network flow community typically serves three optimization goals: concurrent flow value, which proportionately maximizes the flows; total flow value, which maximizes the summation of all flows; and feasibility, which maximizes the number of feasible flows. Since concurrent flow value has been elaborated on in the previous sections, this section will investigate the latter two.

Now we begin to present some results of the variants.

Unsplittable flow: A flow is unsplittable if it can be decomposed into flow paths each of which corresponds to the flow from one source to the target and the correspondence is one-to-one. By a flow path, we mean a flow which has non-zero congestion only along a path, and we say that a flow path passes an edge if the flow has non-zero congestion on the edge.

Since on trees there is a unique path connecting any two vertices, flows on trees are intrinsically unsplittable. Consequently, by Theorem 5, even under the unsplittable flow model, LoMuF on trees is polynomial-time solvable. Actually, all the results in the previous sections remain true under the unsplittable flow model, since all the flows in the proofs are unsplittable. Moreover, stronger results can be obtained. See the following theorem as an example.

Theorem 14.

Under the unsplittable flow model, LoMuF is NP-hard and cannot be approximated within ratio 2 in polynomial time.

Proof.

Roughly speaking, we reduce 3-DM to LoMuF, and show that the optimum value of LoMuF under the unsplittable flow model exhibits a large gap, depending on whether or not a perfect matching exists.

Basically, we follow the proof of Theorem 4. Given an instance $(X,Y,Z,W)$ of 3-DM with $|X|=k$ and $|W|=l$ , let $G=(V,E,\vec{c})$ be the capacitated undirected graph as constructed in the proof of Theorem 4 (illustrated in Figure 2). We also adopt supply vectors $\vec{d}_{i},1\leq i\leq l$ as in the proof of Theorem 4. For ease of reading, the vectors are redefined here. For any $1\leq i\leq k,k+1\leq j\leq l,v\in V$ ,

\vec{d}_{i}(v)=\begin{cases}-1&\textrm{if }v\in\{t_{X},t_{Y},t_{Z}\}\\ 0&\textrm{otherwise}\end{cases}

\vec{d}_{j}(v)=\begin{cases}-1&\textrm{if }v\in\{t^{\prime}_{X},t^{\prime}_{Y},t^{\prime}_{Z}\}\\ 0&\textrm{otherwise}\end{cases}

The rest of the proof consists of two parts.

Part 1: If $W$ contains a perfect matching, $\lambda_{uf}(G;\vec{d}_{1},\cdots,\vec{d}_{l})\geq 1$ , where the subscript $uf$ indicates that the objective value is under the unsplittable flow model.

The proof is identical to the counterpart of the proof of Theorem 4, so it is omitted here.

Part 2: If $W$ contains no perfect matching, $\lambda^{*}=\lambda_{uf}(G;\vec{d}_{1},\cdots,\vec{d}_{l})\leq\frac{1}{2}$ .

Suppose the optimum targets are $v_{1},\cdots,v_{l}$ , and the multi-commodity flow $\vec{f}_{1},\cdots,\vec{f}_{l}$ is valid and satisfies $\lambda^{*}\vec{d}_{1}\circ v_{1},\cdots,\lambda^{*}\vec{d}_{l}\circ v_{l}$ . Under the unsplittable flow model, for any $1\leq i\leq l$ , $\vec{f}_{i}=\vec{f}_{i,X}+\vec{f}_{i,Y}+\vec{f}_{i,Z}$ where $\vec{f}_{i,X}$ , called a summand path of $\vec{f}_{i}$ , is a flow path from $t_{X}$ (or $t^{\prime}_{X}$ when $i>k$ ) to $v_{i}$ , and likewise for $\vec{f}_{i,Y},\vec{f}_{i,Z}$ .

Let $E_{W}$ be the set of edges incident to vertices in $W$ . For $1\leq i\leq l$ , it is easy to observe two facts:

Fact 1: If $v_{i}\in W$ , each summand path of $\vec{f}_{i}$ has non-zero congestion on at least one edge in $E_{W}$ .
Fact 2: If $v_{i}\notin W$ , there are two summand paths of $\vec{f}_{i}$ each having non-zero congestion on at least two edges in $E_{W}$ .

Now we proceed case by case.

Case 1: $v_{i}\notin W$ for some $1\leq i\leq l$ . By Facts 1 and 2, considering that there are $3l$ edges in $E_{W}$ and $3l$ summand paths of all the flows, there must be an edge $e\in E_{W}$ shared by at least two flow paths. Since a flow path has congestion $\lambda^{*}$ on any edge along it and each edge has capacity 1, we see that $\lambda^{*}\leq\frac{1}{2}$ .

Case 2: $v_{i}=v_{j}\in W$ for some $1\leq i\neq j\leq l$ . $\vec{f}_{i}$ and $\vec{f}_{j}$ altogether have six summand paths, each of which arrives at $v_{i}$ . However, $v_{i}$ has only three incident edges, so at least two of the summand paths share an incident edge of $v_{i}$ . Again, since a flow path has congestion $\lambda^{*}$ on any edge along it and each edge has capacity 1, we have $\lambda^{*}\leq\frac{1}{2}$ .

Case 3: $v_{i}$ ’s lie in $W$ and are pairwise different. Assume without loss of generality that $v_{i}=w_{i}$ , for any $1\leq i\leq l$ . Because $W$ contains no perfect matching, there exist $1\leq i\neq j\leq k$ such that $w_{i}\bigcap w_{j}\neq\emptyset$ . Again without loss of generality, assume $x\in X\bigcap w_{i}\bigcap w_{j}$ .

If $\vec{f}_{i,X}$ does not pass the edge $\langle t_{X},x\rangle$ , it must pass more than one edge in $E_{W}$ before reaching $w_{i}$ . Following the argument of Case 1, we see that $\lambda^{*}\leq\frac{1}{2}$ . Likewise, we have $\lambda^{*}\leq\frac{1}{2}$ if $\vec{f}_{j,X}$ does not pass the edge $\langle t_{X},x\rangle$ .

The remaining case is when both $\vec{f}_{i,X}$ and $\vec{f}_{j,X}$ pass the edge $\langle t_{X},x\rangle$ . One gets $\lambda^{*}\leq\frac{1}{2}$ due to the capacity constraint on this edge. ∎

Then we show that restricting targets (i.e., targets can be chosen only in a candidate set of vertices) substantially affects the hardness of target location problems. Since the unrestricted version is a special case of the restricted one, all the hardness results (including the lower bounds of the approximation ratios) remain valid. In fact, restricting targets may make the problems harder, which is confirmed below. Recall Theorem 5 which claims that LoMuF on trees is polynomial-time solvable. Nevertheless, with restricted targets, LoMuF on trees even has no PTAS.

Before going on, let’s recall a property of 3-DM. Let $(X,Y,Z,W)$ be an instance of 3-DM. For any $u\in X\bigcup Y\bigcup Z$ , define its covering set to be $\xi(u)=\{w\in W:u\in w\}$ . It is known that 3-DM remains NP-complete even on 3-covered instances, namely, instances with $\max_{u\in X\bigcup Y\bigcup Z}|\xi(u)|\leq 3$ [8, page 221].
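As a small illustration of this definition, the following Python sketch computes the covering sets and tests the 3-covered condition. The encoding of triples as tuples, the tagging of elements by their ground set, and the helper names `covering_sets` and `is_3_covered` are assumptions made for this example only.

```python
from collections import defaultdict


def covering_sets(triples):
    """Map each element u to xi(u), the set of triples of W that contain u.

    Elements are tagged with their ground set so that, e.g., 0 in X and
    0 in Y stay distinct (an encoding chosen for this sketch).
    """
    xi = defaultdict(set)
    for w in triples:
        for tag, value in zip("XYZ", w):
            xi[(tag, value)].add(w)
    return xi


def is_3_covered(triples):
    """True if every element lies in at most three triples of W."""
    return all(len(ws) <= 3 for ws in covering_sets(triples).values())


W = [(0, 0, 0), (0, 1, 1), (1, 1, 0)]
xi = covering_sets(W)
print(sorted(xi[("X", 0)]))  # the triples covering element 0 of X
print(is_3_covered(W))
```

Elements that appear in no triple simply have an empty covering set and do not affect the check.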

Theorem 15.

LoMuF with restricted targets is NP-hard on trees and, unless P=NP, cannot be approximated within ratio $\frac{7}{6}$ in polynomial time.

Proof.

We prove the theorem by reducing 3-DM to LoMuF.

Arbitrarily fix a 3-covered instance $(X,Y,Z,W)$ of 3-DM. Let $k=|X|$ and $l=|W|$ with $W=\{w_{1},\cdots,w_{l}\}$ . We will construct an instance of LoMuF with restricted targets, including a capacitated undirected graph $G=(V,E,\vec{c})$ , $l+2k$ supply vectors, and a candidate set of vertices in which the targets of the supply vectors can be located.

Specifically, as illustrated in Figure 6, $G$ is a tree consisting of a root $r$ and the set $W$ of leaves, and the capacity of each edge is 6. All the edges are oriented from leaves to the root.

Let $U=X\bigcup Y\bigcup Z$ . For any $u\in U$ , define a supply vector $\vec{d}_{u}$ such that for any $v\in V$ ,

\vec{d}_{u}(v)=\begin{cases}-1&\textrm{if }u\in v\in W\\ |\xi(u)|-3&\textrm{if }v=r\\ 0&\textrm{otherwise}\end{cases}.

For any $1\leq i\leq l-k$ , define a supply vector $\vec{d}_{i}$ such that for any $v\in V$ ,

\vec{d}_{i}(v)=\begin{cases}-3&\textrm{if }v=r\\ 0&\textrm{otherwise}\end{cases}.

Let the candidate set be $W$ , i.e., we are not allowed to choose the root as targets.

The rest of the proof consists of two parts.

Part 1: If $W$ contains a perfect matching, $\lambda_{W}(G;\vec{d}_{1},\cdots,\vec{d}_{l+2k})\geq 1$ , where the subscript $W$ indicates that the candidate set for the targets is $W$ .

Without loss of generality, suppose $W^{\prime}\subseteq W$ is a perfect matching. Let $\phi:U\rightarrow\{1,\cdots,l\}$ be the mapping such that $u\in w_{\phi(u)}\in W^{\prime}$ for any $u\in U$ . For any $u\in U$ , define flow $\vec{f}_{u}$ such that for any edge $e=\langle r,w\rangle$ ,

\vec{f}_{u}(e)=\begin{cases}-2&\textrm{if }w=w_{\phi(u)}\\ 1&\textrm{if }w\in\xi(u)\setminus\{w_{\phi(u)}\}\\ 0&\textrm{otherwise}\end{cases}.

One can check that $\vec{f}_{u}$ satisfies the demand vector $\vec{d}_{u}\circ w_{\phi(u)}$ .

Then arbitrarily fix a bijective mapping $\psi:\{1,\cdots,l-k\}\rightarrow\{1,\cdots,l\}\setminus\phi(U)$ . For any $1\leq i\leq l-k$ , define flow $\vec{f}_{i}$ such that for any edge $e=\langle r,w\rangle$ ,

\vec{f}_{i}(e)=\begin{cases}-3&\textrm{if }w=w_{\psi(i)}\\ 0&\textrm{otherwise}\end{cases}.

One can check that $\vec{f}_{i}$ satisfies the demand vector $\vec{d}_{i}\circ w_{\psi(i)}$ .

Furthermore, it is easy to see that the $l+2k$ flows form a valid multi-commodity flow. This finishes the proof of Part 1.

Part 2: If $W$ contains no perfect matching, $\lambda^{*}=\lambda_{W}(G;\vec{d}_{1},\cdots,\vec{d}_{l+2k})\leq\frac{6}{7}$ .

Let $v_{u}\in W$ for $u\in U$ and $v_{i}\in W$ for $1\leq i\leq l-k$ be targets such that there is a valid multi-commodity flow, consisting of $\vec{f}_{u}$ for $u\in U$ and $\vec{f}_{i}$ for $1\leq i\leq l-k$ , satisfying the demand vectors $\lambda^{*}\vec{d}_{u}\circ v_{u}$ for $u\in U$ and $\lambda^{*}\vec{d}_{i}\circ v_{i}$ for $1\leq i\leq l-k$ .

First of all, for any edge $e=\langle r,w\rangle$ , we can observe two facts:

\displaystyle\sum_{u\in U}|\vec{f}_{u}(e)|\geq(3+n+3n^{\prime})\lambda^{*}

(6)

where $n=|\{u:u\in w,v_{u}=w\}|$ and $n^{\prime}=|\{u:u\notin w,v_{u}=w\}|$ , and

\displaystyle\sum_{1\leq i\leq l-k}|\vec{f}_{i}(e)|\geq 3m\lambda^{*}

(7)

where $m=|\{i:1\leq i\leq l-k,v_{i}=w\}|$ . The detailed proof is omitted since the inequalities are immediate results of applying Lemma 3 to $Cut(\{w\})$ .

Then we proceed case by case.

Case 1: $v_{i}=v_{u}$ for some $1\leq i\leq l-k,u\in U$ .

Let $e=\langle r,v_{i}\rangle$ . Taking $w=v_{i}$ in (6) and (7), we have $n+n^{\prime}\geq 1$ and $m\geq 1$ , so the total congestion on edge $e$ satisfies $\sum_{u^{\prime}\in U}|\vec{f}_{u^{\prime}}(e)|+\sum_{1\leq i^{\prime}\leq l-k}|\vec{f}_{i^{\prime}}(e)|\geq 7\lambda^{*}$ . By the capacity constraint on $e$ , we have $\lambda^{*}\leq\frac{6}{7}$ .

Case 2: $v_{i}=v_{j}$ for some $1\leq i\neq j\leq l-k$ .

Let $e=\langle r,v_{i}\rangle$ . Taking $w=v_{i}$ in (7), we have $m\geq 2$ , so by (6) and (7) the total congestion on edge $e$ satisfies $\sum_{u\in U}|\vec{f}_{u}(e)|+\sum_{1\leq i^{\prime}\leq l-k}|\vec{f}_{i^{\prime}}(e)|\geq 9\lambda^{*}$ . By the capacity constraint on $e$ , we have $\lambda^{*}\leq\frac{2}{3}$ .

Case 3: there exists $w\in W$ such that $|\{u\in U:v_{u}=w\}|\geq 4$ .

Let $e=\langle r,w\rangle$ . For this $w$ we have $n+n^{\prime}\geq 4$ in (6), so $\sum_{u\in U}|\vec{f}_{u}(e)|\geq 7\lambda^{*}$ . By the capacity constraint on $e$ , we have $\lambda^{*}\leq\frac{6}{7}$ .

The rest of the proof will assume that none of the Cases 1-3 happens. Let $W^{\prime}=\{w\in W:v_{u}=w\text{ for some }u\in U\}$ and $W^{\prime\prime}=\{w\in W:v_{i}=w\text{ for some }1\leq i\leq l-k\}$ . We have

\displaystyle W^{\prime}\bigcap W^{\prime\prime}=\emptyset,|W^{\prime\prime}|=l-k,|W^{\prime}|\leq k.

(8)

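Since Case 3 does not happen, each $w\in W^{\prime}$ is the target of at most three elements of $U$ . As the $3k$ elements of $U$ are distributed over at most $k$ vertices of $W^{\prime}$ , the pigeonhole principle further implies that $|W^{\prime}|=k$ and, for any $w\in W^{\prime}$ , $|\{u\in U:v_{u}=w\}|=3$ .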

Case 4: there exists $u\in U$ such that $u\notin v_{u}$ .

Let $e=\langle r,v_{u}\rangle$ . Since $|\{u^{\prime}\in U:v_{u^{\prime}}=v_{u}\}|=3$ and $u\notin v_{u}$ , we have $n+n^{\prime}=3$ and $n^{\prime}\geq 1$ in (6), so $\sum_{u^{\prime}\in U}|\vec{f}_{u^{\prime}}(e)|\geq 8\lambda^{*}$ . By the capacity constraint on $e$ , we have $\lambda^{*}\leq\frac{3}{4}$ .

Case 5: None of the above cases happens.

Since $u\in v_{u}$ for any $u\in U$ , we have $|U|\leq|\bigcup_{w\in W^{\prime}}w|$ , which is at most $3k$ due to (8). Recall that $|U|=3k$ , so $|\bigcup_{w\in W^{\prime}}w|=3k$ . As a result, $w\bigcap w^{\prime}=\emptyset$ for any distinct $w,w^{\prime}\in W^{\prime}$ , which implies that $W^{\prime}$ is a perfect matching, contradicting the assumption that $W$ contains no perfect matching. Therefore, Case 5 never happens.

To sum up all the cases, $\lambda^{*}\leq\frac{6}{7}$ . The proof ends. ∎
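The case analysis can be sanity-checked by brute force on tiny instances. The sketch below is only an illustration under stated assumptions: it uses the fact that on a star every edge $\langle r,w\rangle$ must carry exactly the net amount entering or leaving the leaf $w$, so the load of a target assignment is $3+n+3n^{\prime}+3m$ as in (6) and (7). The function name `best_lambda`, the tuple encoding of triples, and the two sample instances are hypothetical, and the code assumes $l\geq k$.

```python
from fractions import Fraction
from itertools import product

CAPACITY = 6  # capacity of every edge <r, w> in the construction


def best_lambda(k, triples):
    """Brute-force the optimum concurrent-flow value of the star instance.

    k is |X| = |Y| = |Z| (elements are 0..k-1); triples is the list W.
    Assumes len(triples) >= k.
    """
    leaves = [{("X", x), ("Y", y), ("Z", z)} for x, y, z in triples]
    l = len(leaves)
    universe = [(tag, a) for tag in "XYZ" for a in range(k)]
    best = Fraction(0)
    # The first 3k entries are targets of the element commodities d_u,
    # the last l - k entries are targets of the extra commodities d_i.
    for targets in product(range(l), repeat=3 * k + (l - k)):
        load = [0] * l
        for u, t in zip(universe, targets):
            for w in range(l):
                if u in leaves[w]:
                    load[w] += 2 if t == w else 1  # u in w
                elif t == w:
                    load[w] += 3                   # u not in w, target w
        for t in targets[3 * k:]:
            load[t] += 3                           # extra commodity at leaf t
        best = max(best, Fraction(CAPACITY, max(load)))
    return best


W_yes = [(0, 0, 0), (1, 1, 1), (0, 1, 1)]  # contains a perfect matching
W_no = [(0, 0, 0), (0, 1, 1), (1, 1, 0)]   # contains no perfect matching
print(best_lambda(2, W_yes), best_lambda(2, W_no))
```

The enumeration is exponential in $l+2k$, so it is meant for checking the gap on toy inputs, not as an algorithm.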

The following theorem is also a surprise. In the unrestricted case, if all the supply vectors are uni-source (i.e., each having a single source), a trivial optimum solution to LoMuF is choosing the sources themselves as targets. However, when targets are restricted to a prescribed set, LoMuF becomes NP-hard even on uni-source supply vectors and stars (i.e., trees of depth $1$ ).

Theorem 16.

LoMuF with restricted targets is NP-hard on uni-source supply vectors and stars.

Proof.

We prove the theorem via a reduction from the 3-partition problem to LoMuF. To this end, given an instance $S=\{s_{1},\cdots,s_{3m}\}$ of the 3-partition problem, we construct an instance of LoMuF with restricted targets, including a capacitated star, $3m$ supply vectors, and a candidate set of targets.

Specifically, as illustrated in Figure 7, the capacitated undirected star $G=(V,E,\vec{c})$ consists of the center $r$ and the set $U$ of $m$ leaves $u_{1},\cdots,u_{m}$ . Orient every edge to point to $r$ . Each edge has capacity $B$ , where $B=\frac{\sum_{s\in S}s}{m}$ . For any $1\leq i\leq 3m$ , define supply vector $\vec{d}_{i}\in\mathbb{R}_{-}^{V}$ such that $\vec{d}_{i}(r)=-s_{i}$ and $\vec{d}_{i}(u)=0$ for any $u\in U$ . Appoint $U$ to be the candidate set of targets.

Our proof will be done in two steps.

Step 1. If $S$ has an equi-partition, then $\lambda_{U}(G;\vec{d}_{1},\cdots,\vec{d}_{3m})\geq 1$ .

Let $\{S_{1},\cdots,S_{m}\}$ be an equi-partition of $S$ . For any $1\leq i\leq 3m$ , let $1\leq j\leq m$ satisfy $s_{i}\in S_{j}$ , and we define flow $\vec{f}_{i}$ such that for any $e\in E$ ,

\vec{f}_{i}(e)=\begin{cases}-s_{i}&\textrm{if }e=\langle r,u_{j}\rangle\\ 0&\textrm{otherwise}\end{cases}.

One can check that $\vec{f}_{i}$ satisfies demand vector $\vec{d}_{i}\circ u_{j}$ .

Since $\{S_{1},\cdots,S_{m}\}$ is an equi-partition of $S$ , all these flows form a valid multi-commodity flow. Hence, we have $\lambda_{U}(G;\vec{d}_{1},\cdots,\vec{d}_{3m})\geq 1$ .

Step 2. If $\lambda_{U}(G;\vec{d}_{1},\cdots,\vec{d}_{3m})\geq 1$ , $S$ has an equi-partition.

Let $v_{1},\cdots,v_{3m}\in U$ be such that there is a valid multi-commodity flow $\{\vec{f}_{1},\cdots,\vec{f}_{3m}\}$ satisfying $\vec{d}_{1}\circ v_{1},\cdots,\vec{d}_{3m}\circ v_{3m}$ . For any $1\leq j\leq m$ , let $I_{j}=\{1\leq i\leq 3m:v_{i}=u_{j}\}$ . Applying Lemma 3, we have $|\vec{f}_{i}(\langle r,u_{j}\rangle)|\geq s_{i}$ for any $i\in I_{j}$ . Due to the capacity constraint, one gets $B\geq\sum_{i\in I_{j}}|\vec{f}_{i}(\langle r,u_{j}\rangle)|$ . As a result, $mB\geq\sum_{1\leq j\leq m}\sum_{i\in I_{j}}|\vec{f}_{i}(\langle r,u_{j}\rangle)|\geq\sum_{s\in S}s=mB$ , meaning that all these inequalities are tight, i.e., $\sum_{i\in I_{j}}s_{i}=\sum_{i\in I_{j}}|\vec{f}_{i}(\langle r,u_{j}\rangle)|=B$ for any $1\leq j\leq m$ . Hence, $\{S_{j}=\{s_{i}:i\in I_{j}\}:1\leq j\leq m\}$ is an equi-partition of $S$ . ∎
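The reduction can be illustrated on toy inputs. In the star instance, commodity $i$ with target $u_{j}$ must ship $s_{i}$ units over the edge $\langle r,u_{j}\rangle$ , so a target assignment has concurrent-flow value $B$ divided by the largest leaf load. The following sketch enumerates all assignments; the name `best_lambda_star` and the sample inputs are assumptions of this example, and the code compares part sums with $B$ only, as the proof does.

```python
from fractions import Fraction
from itertools import product


def best_lambda_star(sizes, m):
    """Optimum concurrent-flow value of the star instance built from S.

    sizes lists s_1..s_{3m} (positive integers); every edge <r, u_j>
    has capacity B = sum(S) / m.
    """
    B = Fraction(sum(sizes), m)
    best = Fraction(0)
    for targets in product(range(m), repeat=len(sizes)):
        load = [0] * m
        for s, j in zip(sizes, targets):
            load[j] += s  # commodity i ships s_i over edge <r, u_j>
        best = max(best, B / max(load))
    return best


print(best_lambda_star([4, 5, 6, 4, 5, 6], 2))    # parts {4,5,6} and {4,5,6}
print(best_lambda_star([4, 4, 4, 4, 4, 10], 2))   # no two parts of sum 15
```

The value reaches 1 exactly when the items can be split into $m$ groups of sum $B$ ; the enumeration takes $m^{3m}$ steps and is meant for illustration only.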

Then we discuss the target location version of the maximum multi-commodity flow problem. Arbitrarily fix supply vectors $\vec{d}_{i}\in\mathbb{R}_{-}^{V},i\in I$ on a capacitated directed/undirected graph with vertex set $V$ , where $I$ is a finite index set. Roughly speaking, we are to locate targets for the supply vectors so as to maximize the total flow value. In particular, we have to find $v_{i},i\in I$ to maximize $\sum_{i\in I}\lambda_{i}\|\vec{d}_{i}\|_{1}$ , where non-negative reals $\lambda_{i}$ ’s are such that

1.

For any $i\in I$ , there exists a flow $\vec{f}_{i}$ satisfying the demand vector $\lambda_{i}\vec{d}_{i}\circ v_{i}$ , and
2.

$\{\vec{f}_{i}:i\in I\}$ is a valid multi-commodity flow.

It is worth noting that all the preceding results in this paper still hold (and the proofs are also valid), except that we are not sure whether the lower bounds of approximation ratio remain true.

Finally, we investigate the target location version of the maximum feasibility problem (maxf-LoMuF for short). Intuitively, our goal is to locate the targets so as to maximize the number of satisfiable supply vectors. Formally, given a set $S$ of demand vectors on a capacitated network $G$ , its feasibility $\zeta(G;S)$ is defined to be the maximum size of a subset of $S$ that can be simultaneously satisfied, namely, $\zeta(G;S)=\max_{S^{\prime}\subseteq S,\lambda(G;S^{\prime})\geq 1}|S^{\prime}|$ . Given supply vectors $\vec{d}_{i}\in\mathbb{R}_{-}^{V},i\in I$ on a capacitated directed/undirected graph $G$ with vertex set $V$ , the task of maxf-LoMuF is to find $v_{i}\in V,i\in I$ so as to maximize $\zeta(G;\vec{d}_{i}\circ v_{i},i\in I)$ . By abusing notation, the optimum objective value will also be denoted by $\zeta(G;\vec{d}_{i},i\in I)$ .

We will show that maxf-LoMuF is hard to approximate. The proof relies on a reduction from the well-studied maximum independent set problem (MIS) which aims to find a maximum set of vertices that are pairwise non-adjacent in a given graph. Let’s first recall a property of MIS.

Lemma 17 ([11]).

For any constant $\epsilon>0$ , unless NP=ZPP, MIS cannot be approximated within $O(n^{1-\epsilon})$ in polynomial time on graphs of $n$ vertices.

Theorem 18.

For any constant $\epsilon>0$ , unless NP=ZPP, the maxf-LoMuF problem on $k$ supply vectors cannot be approximated within $O(k^{1-\epsilon})$ in polynomial time.

Proof.

We prove the theorem by reducing MIS to maxf-LoMuF. Namely, given a graph $G=(V,E)$ with $V=\{v_{1},\cdots,v_{k}\}$ , we will construct a capacitated graph $G^{\prime}=(V^{\prime},E^{\prime},\vec{c})$ and supply vectors $\vec{d}_{1},\cdots,\vec{d}_{k}\in\mathbb{R}_{-}^{V^{\prime}}$ , where $k=|V|$ .

Specifically, $V^{\prime}=V\bigcup E\bigcup W$ , where $W=\{w_{e}:e\in E\}$ . $E^{\prime}=E^{\prime}_{1}\bigcup E^{\prime}_{2}$ , where $E^{\prime}_{1}=\{\langle v,e\rangle:v\in V,e\in E,v\textrm{ is an end of }e\}$ and $E^{\prime}_{2}=\{\langle e,w_{e}\rangle:e\in E\}$ . Every edge of $G^{\prime}$ has capacity 1. The graph $G^{\prime}$ is illustrated in Figure 8. We choose to orient every edge upward.

For any $1\leq i\leq k$ , define a supply vector $\vec{d}_{i}$ such that for any $v\in V^{\prime}$ ,

\vec{d}_{i}(v)=\begin{cases}-1&\textrm{if }v\in\{v_{i}\}\bigcup\{w_{e}:e\textrm{ is incident to }v_{i}\}\\ 0&\textrm{otherwise}\end{cases}.

Arbitrarily fix a subset $I\subseteq\{1,\cdots,k\}$ . We prove two claims:

•

Claim 1: If $\{v_{i}:i\in I\}$ is an independent set of $G$ , then $\lambda(G^{\prime};\vec{d}_{i},i\in I)\geq 1$ .

Suppose $\{v_{i}:i\in I\}$ is an independent set of $G$ . For any $i\in I$ , define flow $\vec{f}_{i}$ such that for any $e^{\prime}=\langle u,e\rangle\in E^{\prime}$ with $e\in E$ ,

\vec{f}_{i}(e^{\prime})=\begin{cases}1&\textrm{if }u\in\{v_{i},w_{e}\},e\textrm{ is incident to }v_{i}\textrm{ in }G\\ 0&\textrm{otherwise}\end{cases}.

It is easy to check that $\{\vec{f}_{i}:i\in I\}$ is a valid multi-commodity flow satisfying $\{\vec{d}_{i}\circ v_{i}:i\in I\}$ . Hence, $\lambda(G^{\prime};\vec{d}_{i},i\in I)\geq 1$ .

•

Claim 2: If $\lambda(G^{\prime};\vec{d}_{i},i\in I)\geq 1$ , then $\{v_{i}:i\in I\}$ is an independent set of $G$ .

Assume $\lambda(G^{\prime};\vec{d}_{i},i\in I)\geq 1$ . Choose $u_{i}\in V^{\prime},i\in I$ such that there is a valid multi-commodity flow $\{\vec{f}_{i}:i\in I\}$ which satisfies $\{\vec{d}_{i}\circ u_{i}:i\in I\}$ .

We now show that $\{v_{i}:i\in I\}$ is an independent set of $G$ . For contradiction, suppose $i\neq i^{\prime}\in I$ are such that $v_{i}$ and $v_{i^{\prime}}$ are both incident to $e\in E$ in $G$ . By Lemma 3, no matter where $u_{i}$ lies, we always have $|\vec{f}_{i}(\langle e,w_{e}\rangle)|\geq 1$ . Likewise, we also have $|\vec{f}_{i^{\prime}}(\langle e,w_{e}\rangle)|\geq 1$ . Considering that $\vec{c}(\langle e,w_{e}\rangle)=1$ , we reach a contradiction. Claim 2 holds.

By Claims 1 and 2, for any $\alpha$ -approximate solution to the instance of maxf-LoMuF, we can construct an $\alpha$ -approximate solution to the instance of MIS, and vice versa. Then the theorem holds due to Lemma 17. ∎
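To make the construction concrete, the following sketch builds $G^{\prime}$ and the supply vectors from a small graph and evaluates the flows of Claim 1, in which every chosen $v_{i}$ is its own target and each edge $\langle e,w_{e}\rangle$ carries one unit per chosen endpoint of $e$ . The tuple encodings of the vertices $e$ and $w_{e}$ and the function names are assumptions of this example.

```python
from itertools import combinations


def build_instance(vertices, edges):
    """Return (E', supplies) for the graph G' of the reduction.

    Each edge e of G becomes a middle vertex ("e", e) and a top vertex
    ("w", e); these encodings are chosen for this sketch only.
    """
    e_prime = []
    for e in edges:
        a, b = e
        e_prime += [(a, ("e", e)), (b, ("e", e)), (("e", e), ("w", e))]
    supplies = {
        v: {v: -1, **{("w", e): -1 for e in edges if v in e}}
        for v in vertices
    }
    return e_prime, supplies


def claim1_loads(edges, chosen):
    """Load on each edge <e, w_e> when every chosen v_i is its own target."""
    return {e: sum(1 for v in chosen if v in e) for e in edges}


def largest_routable_subset(vertices, edges):
    """Largest subset whose Claim 1 flows respect the unit capacities."""
    for size in range(len(vertices), 0, -1):
        for chosen in combinations(vertices, size):
            if all(x <= 1 for x in claim1_loads(edges, chosen).values()):
                return set(chosen)
    return set()


path_vertices, path_edges = [0, 1, 2], [(0, 1), (1, 2)]
e_prime, supplies = build_instance(path_vertices, path_edges)
print(len(e_prime), largest_routable_subset(path_vertices, path_edges))
```

On the path with three vertices the routable subsets are exactly the independent sets, in line with Claims 1 and 2; the subset enumeration is exponential and only meant for small examples.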

We have a trivial approximation algorithm for maxf-LoMuF: Given a capacitated graph $G$ and $k$ supply vectors $\vec{d}_{i},1\leq i\leq k$ , by enumerating, find the first $1\leq i\leq k$ and $v\in V$ such that $\lambda(G;\vec{d}_{i}\circ v)\geq 1$ . This algorithm obviously has approximation ratio $k$ , which is nearly optimum due to Theorem 18.
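A minimal sketch of this enumeration for undirected graphs follows. It is one possible realization, not a prescribed implementation: the single-commodity test $\lambda(G;\vec{d}_{i}\circ v)\geq 1$ is done here with a textbook Edmonds-Karp maximum flow from a super-source, supplies are passed as positive amounts per source (the paper writes them as negative entries), and all function names are hypothetical. Capacities and supplies are assumed to be integers so that the final comparison is exact.

```python
from collections import defaultdict, deque


def max_flow(cap, s, t):
    """Edmonds-Karp maximum flow; cap[u][v] holds residual capacities."""
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        # Recover the augmenting path, then push its bottleneck value.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        total += push


def satisfiable(edges, supply, target):
    """Check whether one supply vector can be fully routed to target.

    edges: iterable of (u, v, capacity); supply: {source: amount > 0}.
    """
    cap = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        cap[u][v] += c  # an undirected edge is usable in both directions
        cap[v][u] += c
    source = object()   # fresh super-source, not a vertex of G
    for u, amount in supply.items():
        cap[source][u] += amount
    return max_flow(cap, source, target) == sum(supply.values())


def trivial_approximation(vertices, edges, supplies):
    """Return the first (i, v) such that supply vector i fits at target v."""
    for i, supply in enumerate(supplies):
        for v in vertices:
            if satisfiable(edges, supply, v):
                return i, v
    return None


edges = [("a", "b", 1), ("b", "c", 1)]
supplies = [{"a": 2, "c": 2}, {"a": 1, "c": 1}]
print(trivial_approximation(["a", "b", "c"], edges, supplies))
```

If the function returns `None`, no single supply vector is satisfiable, so the optimum is 0 as well.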

Remark 5.

The above results (including the algorithm) about maxf-LoMuF can be extended to directed graphs. The proofs remain valid up to minor modifications, so the detailed proofs are omitted here.

6 Conclusion

We formulated the target location problem for multi-commodity flows. It is a natural combination of the classic facility location problem and the multi-commodity flow problem, and extends both. It is interesting in theory and well-rooted in real-world applications.

We mainly study the issue of maximizing concurrent flows, both on directed and undirected networks. It is interesting to see that the directed case makes the problem harder: the problem is efficiently solvable on undirected trees, but NP-hard on di-paths. Another separation is that the problem is efficiently solvable for bi-source supply vectors on undirected graphs, while it is NP-hard for such supply vectors on directed graphs. We have also made progress on algorithm design: in addition to an exact algorithm on trees, an approximation algorithm is proposed for arbitrary undirected graphs, which leads to algorithms on symmetric directed graphs.

As the first step towards this novel direction, there remain numerous open questions. We mention just a few.

1.

Though an $\eta$ -approximation algorithm exists on undirected networks, we know nothing about the lower bound of the approximation ratio of the problem. Even whether a PTAS exists remains open. The directed situation is less satisfactory: except for a trivial algorithm with approximation ratio $k$ , no non-trivial approximation algorithm on general directed graphs is known.
2.

The variants deserve further study. Since in many applications targets can be chosen only from a candidate set, the restricted version of our problem is of special interest. Cost minimization is an active topic in classic network flow problems. It can be easily defined in our framework, and is a rich research direction. One more variant has not yet been mentioned: this paper allows choosing just one target for each commodity, but what if more targets can be selected?
3.

Online versions of our problem are also well motivated. Recall the scenario of geo-distributed data analysis. The typical case is that the applications arrive sequentially in an online fashion, rather than all at once as we assumed before. The online setting poses special challenges in algorithm design. We are not even sure whether an algorithm with a guaranteed competitive ratio exists.

Acknowledgement

We are grateful to the anonymous referees for detailed corrections and suggestions. We also thank Prof. Yungang Bao for his encouragement and support. Special thanks go to Dr. Laiping Zhao, who helped the authors formulate the problem.

References

[1] H. An, M. Singh, and O. Svensson. LP-based algorithms for capacitated facility location. In 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, pages 256–265, 2014.
[2] K. Andreev, C. Garrod, D. Golovin, B. M. Maggs, and A. Meyerson. Simultaneous source location. ACM Transactions on Algorithms, 6(1):Article No. 16, 2009.
[3] K. Arata, S. Iwata, K. Makino, and S. Fujishige. Locating sources to meet flow demands in undirected networks. Journal of Algorithms, 42(1):54–68, 2002.
[4] A. Bernáth. Source location in undirected and directed hypergraphs. Oper. Res. Lett., 36(3):355–360, 2008.
[5] C. Chekuri, S. Khanna, and F. B. Shepherd. The all-or-nothing multicommodity flow problem. SIAM Journal on Computing, 42(4):1467–1493, 2013.
[6] P. Christiano, J. Kelner, A. Madry, D. Spielman, and S.-H. Teng. Electrical flows, Laplacian systems, and faster approximation of maximum flow in undirected graphs. In Proceedings of the Annual ACM Symposium on Theory of Computing, pages 273–282, June 2011.
[7] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms. MIT Press, 2009.
[8] M. R. Garey and D. S. Johnson. Computers and intractability: A guide to the theory of NP-completeness, 1979.
[9] S. Guha and S. Khuller. Greedy strikes back: Improved facility location algorithms. Journal of Algorithms, 31(1):228–248.
[10] S. L. Hakimi. Optimum locations of switching centers and the absolute centers and medians of a graph. Operations Research, 12(3):450–459, June 1964.
[11] J. Håstad. Clique is hard to approximate within $n^{1-\epsilon}$ . Acta Mathematica, 182(1):105–142, 1999.
[12] P. Heßler and H. W. Hamacher. Sink location to find optimal shelters in evacuation planning. EURO Journal on Computational Optimization, 4(3):325–347, 2016.
[13] H. Ito, M. Paterson, and K. Sugihara. The multi-commodity source location problems and the price of greed. Journal of Graph Algorithms and Applications, 13(1):55–73, 2009.
[14] J. A. Kelner, Y. T. Lee, L. Orecchia, and A. Sidford. An almost-linear-time algorithm for approximate max flow in undirected graphs and its multicommodity generalizations. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 217–226, USA, 2014. Society for Industrial and Applied Mathematics.
[15] J. A. Kelner, G. L. Miller, and R. Peng. Faster approximate multicommodity flow using quadratically coupled flows. In Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, pages 1–18, New York, NY, USA, 2012. Association for Computing Machinery.
[16] S. Khuller. Data-Aware Scheduling in Datacenters. PhD thesis, University of Maryland (College Park, Md.), 2016.
[17] P. Kolman and C. Scheideler. Improved bounds for the unsplittable flow problem. In Thirteenth ACM-SIAM Symposium on Discrete Algorithms, 2002.
[18] G. Kortsarz and Z. Nutov. A note on two source location problems. Journal of Discrete Algorithms, 6(3):520–525, Sept. 2008.
[19] R. Krauthgamer, J. R. Lee, and H. Rika. Flow-cut gaps and face covers in planar graphs. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 525–534. SIAM, 2019.
[20] M. Labbe, D. Peeters, and J.-F. Thisse. Location on networks. Network Routing, 8, 02 1992.
[21] Y. T. Lee, S. Rao, and N. Srivastava. A new approach to computing maximum flows using electrical flows. In Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, pages 755–764, 2013.
[22] S. Li. A 1.488 approximation algorithm for the uncapacitated facility location problem. In Automata, Languages & Programming-international Colloquium, Icalp, Zurich, Switzerland, July, Part II, pages 45–58.
[23] A. Madry. Computing maximum flow with augmenting electrical flows. In Proceedings of the IEEE 57th Annual Symposium on Foundations of Computer Science, pages 593–602, 2016.
[24] S. Mamada, K. Makino, and S. Fujishige. Optimal sink location problem for dynamic flows in a tree network. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E85-A(5):1020–1025, 2002.
[25] S. Mamada, T. Uno, K. Makino, and S. Fujishige. An algorithm for the optimal sink location problem in dynamic tree networks. Discrete Applied Mathematics, 154(16):2387–2401, 2006.
[26] L. Monis, B. Kunjumon, and N. Guruprasad. Implementation of Maximum Flow Algorithm in an Undirected Network, pages 195–202. Springer Singapore, Singapore, 01 2019.
[27] M. T. Y. J. Naga, U. S., N. A., and G. S. A technical survey on optimization of processing geo-distributed data. Journal of Physics: Conference Series, 1000:012140, 2018.
[28] R. Peng. Approximate undirected maximum flows in O(m polylog(n)) time. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1862–1867, 2016.
[29] Q. Pu, G. Ananthanarayanan, P. Bodik, S. Kandula, A. Akella, P. Bahl, and I. Stoica. Low latency geo-distributed data analytics. In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, SIGCOMM ’15, pages 421–434, 2015.
[30] M. Sakashita, K. Makino, and S. Fujishige. Minimum cost source location problems with flow requirements. Algorithmica, 50(4):555–583, 2006.
[31] A. Salmasi, A. Sidiropoulos, and V. Sridhar. On constant multi-commodity flow-cut gaps for families of directed minor-free graphs. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 535–553. SIAM, 2019.
[32] F. B. Shepherd, A. Vetta, and G. T. Wilfong. Polylogarithmic approximations for the capacitated single-sink confluent flow problem. In Proceedings of the IEEE 56th Annual Symposium on Foundations of Computer Science, pages 748–758, 2015.
[33] J. Sherman. Area-convexity, $l_{\infty}$ regularization, and undirected multicommodity flow. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 452–460, 2017.
[34] D. B. Shmoys, É. Tardos, and K. Aardal. Approximation algorithms for facility location problems (extended abstract). In Proceedings of the Twenty-Ninth Annual ACM Symposium on the Theory of Computing, El Paso, Texas, USA, May 4-6, 1997, 1997.
[35] H. Tamura, M. Sengoku, S. Shinoda, and T. Abe. Location problems on undirected flow networks. IEICE Trans., E73:1989–1993, 1990.
[36] B. C. Tansel, R. L. Francis, and T. J. Lowe. Location on networks: A survey. Part II: Exploiting tree network structure. Management Science, 29(4):498–511, 1983.
[37] V. V. Vazirani. Approximation Algorithms. Springer, 2001.
[38] L. Yin, J. Sun, L. Zhao, C. Cui, J. Xiao, and C. Yu. Joint scheduling of data and computation in geo-distributed cloud systems. In 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pages 657–666, 2015.
[39] L. Yue, L. Zhao, C. Cui, and C. Yu. Fast big data analysis in geo-distributed cloud. In IEEE International Conference on Cluster Computing, 2016.
[40] L. Zhao, Y. Yang, A. Munir, A. X. Liu, Y. Li, and W. Qu. Optimizing geo-distributed data analytics with coordinated task scheduling and routing. IEEE Transactions on Parallel and Distributed Systems, 31(2):279–293, 2020.