Target Location Problem for Multi-commodity Flow
Abstract
Motivated by scheduling in Geo-distributed data analysis, we propose a target location problem for multi-commodity flow (LoMuF for short). Given commodities to be sent from their resources, LoMuF aims at locating their targets so that the multi-commodity flow is optimized in some sense. LoMuF is a combination of two fundamental problems, namely, the facility location problem and the network flow problem. We study the hardness and algorithmic issues of the problem in various settings. The findings lie in three aspects. First, a series of NP-hardness and APX-hardness results are obtained, uncovering the inherent difficulty in solving this problem. Second, we propose an approximation algorithm for general undirected networks and an exact algorithm for undirected trees, which naturally induce efficient approximation algorithms on directed networks. Third, we observe separations between directed networks and undirected ones, indicating that imposing direction on edges makes the problem strictly harder. These results show the richness of the problem and pave the way to further studies.
1 Introduction
Nowadays, data is generated geo-distributively at a much higher speed as compared to the existing data transfer speed; for instance, telescopes around the world bring us an unimaginable amount of astronomy data. There are two main reasons for having geo-distributed data: (1) Datacenters (DCs) are built across the globe. (2) Organizations prefer to use multiple clouds to increase reliability, security, and processing. Besides, there exist applications that process and analyze a huge amount of massively geo-distributed data to extract useful information. A typical scenario in processing geo-distributed data is that several analysis tasks are running simultaneously, and each requires a fraction of the collected data [27, 29, 39, 38, 16]. In addition, every analysis task moves needed data to a single location before the computation. Fig. 1 shows an example of geo-distributed telescope data.
The network bandwidth is a crucial factor in geo-distributed data movement and becomes the resource bottleneck. For example, the demand for bandwidth increased from 60 to 290 Tbps between the years 2011 and 2015 while the network capacity growth was not proportional. In 2015, the network capacity growth was only 40 percent, which was the lowest during the years 2011 and 2014222https://www.telegeography.com/researchservices/global-bandwidth-research-service/. When applications (such as electromagnetic radiation and infrared ray analysis), each handling data from some datacenters, have to be deployed, there is no meaningful notion of distance. The latency (travel time of a single small packet) under low-congestion conditions tends not to be noticeable to the end-users. The real difficulty here is the underlying capacity of the network. If links become congested, then the latency will increase and throughput will suffer. A key issue is how to allocate enough bandwidth to each application without causing congestion on the network [40].
Hence, we need to choose proper locations for the tasks to reduce congestion. Specifically, we propose this target location problem for multi-commodity flow: Given sources of multiple commodities on a capacitated network, the goal is to locate the targets to maximize the flow value.
We fix sources because, as our motivating example of geo-distributed data analysis shows, it is difficult, if not impossible, to change datacenters that collect data since for efficiency as these datacenters should be close to data generators. However, it is much more flexible to choose the target locations where the analysis tasks are performed.
The multi-commodity flow problem (MCF) is one of the most fundamental problems with a wide variety of scientific and engineering applications that have been studied intensively [32, 26]. In the most typical scenario, a finite number of commodities have to be sent from their sources to targets on a capacitated network. Each commodity has its own flow, and the commodities interact when their flows compete for capacity on common edges.
There are two general classes of MCF. One is network analysis which, based on a given network configuration, finds the optimal flow pattern for some objective function. The most studied objective functions include maximizing flow values and minimizing flow costs. The other belongs to network synthesis which seeks an optimal network configuration satisfying certain requirements.
In both classes, the targets of the commodities are taken for granted. To our surprise, researches have long neglected how targets are chosen. This paper is devoted to initializing such a theory.

Our proposed problem extends the MCF framework. It does not belong to either of the two classes. It is a combination of the facility location problem and the network flow problem which are inherently related [1]. Facility location is a branch of operations research related to locating or positioning at least a new facility among several existing facilities to optimize (minimize or maximize) at least one objective function. It is among the most fundamental problems in operations research and theoretical computer science [37]. Facility location met network flow in 1990 [35] and has inspired a series of work [13, 3, 18, 2, 12]. However, all the published works minimize costs of the selected sources or targets (not the flow cost which, together with flow value, is the objective of network flow problems), and never consider the multi-commodity setting. The most crucial difference lies in that our combination is inherent, meaning that the objective is to optimize the flow value, but the literature focuses on the cost of the selected nodes rather than the cost of flow. Another benefit of our framework is that it can naturally extend almost every network flow problem, e.g., flow cost minimization.
Our model has other applications, such as Web server deployment. There are serving various demands from widely-distributed users, for example, requesting for different online video, where should the servers be located so that the users have a good experience? Again we do not care distance, and the key objective is to optimize the available bandwidth. There are more motivating examples, say, network-flow based evacuation planning for an emergency where shelters have to be selected, and congestion decides the efficiency of evacuation. Interested readers are referred to [12].
These real-world scenarios justify our problem’s critical features: The commodities are only partially determined since the targets are not given and have themselves to be optimized, and a decisive factor of the optimization is the bandwidth rather than any notion of distance. This well motivates our problem.
1.1 Results and Discussion
We propose a novel model of the target location for multi-commodity flow (LoMuF). On the one hand, we figure out the hardness results of various versions. On the other hand, we design algorithms for several versions. The results are as follows.
-
1.
We show that the LoMuF problem is NP-hard on general undirected graphs.
We know that if the targets are fixed, the problem degenerates to one normal multi-commodity flow problem (allowing fractional flows) and becomes tractable in polynomial time, which shows that the most challenging part of this problem is indeed how to locate the targets.
-
2.
We design a polynomial-time algorithm solving LoMuF on trees.
Trees are important network structures in practice. Compared to the NP-hardness result above, the fact that there is only one path connecting a source and the target on a tree simplifies the problem. Our algorithm is elegant and surprisingly shows that the interaction between different commodities becomes not harmful on trees.
-
3.
We present a -approximation algorithm for LoMuF on general undirected graphs, where is the largest source number among all commodities.
This result actually shows that, when (the so-called bi-source cases) the problem can also be solved efficiently, but (take into account the NP-hardness result above) becomes intractable when .
-
4.
For LoMuF on directed graphs (Di-LoMuF), we prove that it is also NP-hard and even cannot be efficiently approximated with a ratio less than 2.
In fact, we also show that LoMuF on undirected graphs can be reduced to the directed case Di-LoMuF, and then Di-LoMuF should be even harder.
-
5.
Di-LoMuF also remains NP-hard on symmetric di-paths and bi-source supply vectors.
These are clear separations between undirected LoMuF and Di-LoMuF, since undirected LoMuF is efficiently solvable on trees while Di-LoMuF is even difficult on paths, a very special case of trees. As we pointed out above, the bi-source instances are easy for undirected LoMuF, but not for Di-LoMuF.
-
6.
For the special case on symmetric di-trees, Di-LoMuF has a polynomial-time 2-approximation algorithm.
Though we have seen several hardness results of Di-LoMuF, for a special but still meaningful subset, where every link has the same capability of downloading and uploading, we can obtain an efficient approximation algorithm.
-
7.
We show that our results above can also be extended to other variants of LoMuF such as maximum sum flows, unsplittable flows, restricted candidate targets, maximum feasible flows, and so on. For the unsplittable version, we show that it cannot be approximated within ratio 2. For the version with restrictions on targets, it is NP-hard on uni-source supply vectors and stars and cannot be efficiently approximated within ratio on trees. For the maximum feasible flows version, we prove that for any constant , unless NP=ZPP, it cannot be approximated within on supply vectors.
This shows that the framework of the new location problems has a powerful capability of modeling different scenarios in practice and enriches the theory of location problems and network flow problems.
1.2 Related Work
There is an increasing vast literature on multi-commodity flow and its single-commodity special case [32]. Basically, there are two types of optimization objectives, namely, minimum cost and maximum flow which is the focus of this paper. The main theme of maximum flow in recent years is improving the efficiency of approximation algorithms [26, 14, 6, 21, 28, 23, 15, 33, 5]. The flow-cut duality is also a challenging issue and has attracted much attention from researchers [19, 31].
Facility location has flourished ever since the 1960s and remains an active topic in operations research and theoretical computer science [10, 20, 37, 20]. Though generally, no constant-ratio approximation algorithm exists, it can be constant-approximated on metric spaces. One of the main threads of research is to improve the approximation ratio in various situations [34, 22, 9].
Though inherently related to multi-commodity flow[1], facility location got to be combined with network flow only in 1990 [35], when the source location problem was proposed. Roughly speaking, the mission of the source location problem on a network is to find a set of sources from which enough flow can be sent to each prescribed target. In addition to flow requirements, connectivity and vertex coverage are also frequently used constraints. Work in this line can be classified into two categories. One is independent source location, meaning that the flows to different targets do not interact [3, 30, 4, 24, 25, 13, 3, 18]. The other is simultaneous source location, where the flows concurrently exist and interact by competing edge capacities [2, 12]. An interesting application is emergency evacuation planning [12], where shelters are to be located where residents in a disaster can move to as fast as possible. In such applications, capacities are also usually imposed on network nodes, rather than just on edges in typical network flow models. All the mentioned works have two common features. First, essentially only a single commodity is considered which is multi-source multi-target. Second, the objective is to optimize some measures of the selected sources (say, total cost), rather than the properties of the flow (say, flow value). This is in sharp contrast to our proposed problem.
2 Preliminaries and Problem Statement
In this section, we review key notions and notations used in this paper, and formally define the location problem.
2.1 Preliminaries
Let (, respectively) represent the set of (non-negative, non-positive, respectively) real numbers. We use for a vector, and for its -th entry. When we denote a set by an upper-case letter, we usually write the corresponding (subscripted) lower-case letter for the members.
A network is a capacitated graph , where is the vertex set, is the edge set, and assigns capacities to the edges. We first mainly focus on undirected graphs in this paper, and will consider directed graphs in Section 4. For any , we use , or interchangeably, to denote the edge between . A commodity is described by a demand vector satisfying , where any such that (, respectively) is called a source (a target, respectively). Intuitively, each source has to sent out units of the commodity, and in total units are delivered to target . The vertex set of a graph is denoted by .
To specify flows over a network, we always arbitrarily orient all the edges and keep the orientation implicit unless necessary. For any , let (, respectively) stands for the set of incoming (outgoing, respectively) edges. A flow is a vector , which for any edge , means units of transportation along in orientation if , and opposite direction otherwise. Given flows , we write if for any . A flow is said to satisfy a demand vector , if for any , . A multi-commodity flow, which means a set of flows, is valid if its congestion along any edge is at most .
The maximum concurrent problem (MCF for short) has been extensively and is still being actively studied. Specifically, given demand vectors on a capacitated graph , the mission of MCF is to find the maximum such that can be satisfied by a valid multi-commodity flow on . The optimum will be denoted by .
Let’s recall some properties of MCF.
Lemma 1.
MCF lies in P.
As mentioned in [7, page 863], there is no known purely combinatorial algorithm solving MCF exactly and efficiently. The only commonly used algorithm is based on linear programming.
A multi-commodity flow on a capacitated graph is said to be a decomposition of flow , if and for any edge of .
Lemma 2.
Arbitrarily fix a demand vector on a capacitated graph . Suppose has exactly one target . Then any flow satisfying can be decomposed into a multi-commodity flow which satisfies the demand vectors . Here, each is such that for any vertex of ,
Note that the decomposition in Lemma 2 is not necessarily unique. Any such one will be called a canonical decomposition of the flow.
Given any vertex subset of an graph , the cut induced by , denoted by , is defined to be the set of edges bridging and . Let be the set of edges coming into , and .
Lemma 3.
Suppose that is a flow satisfying a demand vector on a capacitated graph . Then for any , .
2.2 Target Location problem
Intuitively, our goal is to properly locate targets for multiple commodities. We formulate this problem in this subsection.
Given a capacitated graph , any is called a supply vector on . For any supply vector and , we define a demand vector such that for any ,
It is time to formulate the problem of target location for maximizing concurrent multi-commodity flow, LoMuF for short. Given supply vectors on a capacitated graph , LoMuF aims at finding such that is maximized. By abuse of the notation, the optimum objective value is again denoted by .
3 Hardness and Algorithms of LoMuF
We begin with studying the hardness of LoMuF. Our work refers to a well-known NP-complete problem, 3-dimensional matching (3-DM for short). Though LoMuF is NP-hard in general, we devise an algorithm solving LoMuF problems on trees efficiently, and show that a simple strategy could be a not-bad solution for graphs with bounded sources.
3.1 Hardness Result
A 3-DM instance is a quadruple , where are pairwise disjoint finite sets of equal size, and . The goal is to decide whether contains a perfect matching, namely, a subset such that and ? The trivial cases where will not be considered.
We first show that LoMuF is NP-hard, which is more or less a surprise, compared with Lemma 1.
Theorem 4.
Given supply vectors on a capacitated graph , it is NP-complete to decide whether .
Proof.
Choose a target for supply vectors , for any . Due to Lemma 1, we can use as a certificate to check whether . This means that the decision problem lies in NP.
To prove NP-completeness, it suffices to establish a reduction from 3-DM.

Given a 3-DM instance with and , we construct an capacitated graph as illustrated in Figure 2. Specifically, consists of three subgraphs connected via . is a complete bipartite graph of vertex sets and , and any is adjacent to if and only if , likewise for . All the edges are oriented upward in Figure 2.
As to the capacity, let be the set of red edges, namely, those incident to or . For any with , let . Then for any ,
We define supply vectors such that for any ,
The rest of the proof is devoted to showing that has a perfect matching if and only if the LoMuF instance satisfies , which will lead to NP-completeness of our decision problem. The proof consists of two parts.
Part 1: a perfect matching in implies .
Without loss of generality, suppose is a perfect matching. For any , define flow such that for any edge ,
For any , define flow such that for any edge ,
It is straightforward to check that the multi-commodity flow is valid and satisfies the demand vectors . Hence, .
Part 2: implies a perfect matching in .
Suppose the optimum targets are , and the multi-commodity flow is valid and satisfies . Part 2 immediately follows from the two facts:
-
1.
Fact 1: for any .
Consider the congestion of any on . Let’s proceed case by case.
-
•
. Applying Lemma 3 to , we see that the congestion of on is at least 3.
-
•
. Without loss of generality, assume . Applying Lemma 3 to , we see that the congestion of on is at least 2, and those on and are both at least 1. Hence, the congestion of on is at least 4.
Since the total capacity of is which upper-bounds the total congestion, we get Fact 1.
-
•
-
2.
Fact 2: the sets are pairwise disjoint.
For contradiction, suppose without loss of generality that . Applying Lemma 3 to multi-commodity flow and command vectors , we have . This implies that for any . Namely, each edge in is full of upward flow. Likewise, each edge in is also full of upward flow. Let . Then we have for any edge in , since flow along such an edge can’t reach or . This, together with the precondition that satisfies , implies . Likewise, . A contradiction is reached since .
∎
3.2 LoMuF on Trees
Theorem 4 indicates that LoMuF is hard to solve on general graphs, but does not exclude the possibility of an efficient algorithm solving LoMuF for some important special case. Indeed, LoMuF on trees allows a fast algorithm, as presented in Algorithm 1. Actually, networks with tree structure is the also the center of related literature [30, 36, 5, 24, 25, 13, 2].
Without loss of generality, trees will be arbitrarily rooted, so the concepts of ancestors, descendants, and subtrees are well defined as usual. Given vertices of a tree, we write if is a descendant of , and if or .
Let’s begin with a polynomial-time algorithm, which turns out to exactly solve LoMuF on trees.
Input: a capacitated tree , supply vectors
Output:
Theorem 5.
The output of Algorithm 1 is an optimum solution to LoMuF on trees.
Proof.
Given a capacitated tree and supply vectors , let be the output of Algorithm 1. Orient any edge of upward, i.e., from a vertex to its parent. The theorem is proven in two steps.
Step 1: Arbitrarily fix . We claim that for any , any , and any flow satisfying , there is a flow which satisfies .
The claim is proved by induction on the hop distance (i.e., the number of edges) between and , denoted by .
Basis: The claim trivially holds when .
Hypothesis: The claim holds when .
Induction: . Let be the lowest common ancestor of the sources of . We proceed case by case.
Case 1: .
If , set flow such that for any edge ,
One can easily check that and satisfies . Let .
If , it must happen that at the beginning of some “while loop” of Algorithm 1 when handling . That loop must assign to , where is the child of satisfying the condition in Line 3. Note that lies on the path between and the final . Set flow such that for any edge ,
By Lemma 3, we see that . Then the condition in Line 3 implies . Furthermore, one can check that satisfies . Let .
Case 2: . Let be the child of such that . Let be the edges on the path between and . Define flow such that for any edge ,
Since Algorithm 1 outputs rather than for , it must hold that
(1) |
For any edge with , we have
Hence, . One can also check that satisfies . Let .
Case 3: neither nor . Let be the lowest common ancestor of and . We have either or .
If , define flow such that for any edge ,
Then and satisfies . Let .
If , lies on the path between and . Hence, it must happen that at the beginning of some “while loop” of Algorithm 1 when handling . Then that loop does not choose the subtree of containing . Follow the argument of Case 2, there is a flow which satisfies the demand vector . Let .
Altogether, we always have a flow which satisfies the demand vector . Because , we apply the induction hypothesis and finish step 1.
Step 2: Let . Choose such that there is a valid multi-commodity flow satisfying . For any , apply the claim in step 1 to and , resulting in a flow which satisfies . Therefore, we get a valid multi-commodity flow satisfying . This means that the output of Algorithm 1 is an optimum solution to LoMuF. ∎
3.3 Approximation Algorithm on General Graphs
Theorem 5 suggests that LoMuF is not extremely intractable, at least in a special case. Fortunately, the tractability can be extended to more general graphs, in the sense of approximation. Let’s begin with a lemma, which shows the important role of master sources (defined below) in approximating LoMuF.
Arbitrarily fix a supply vector on a capacitated graph . Arbitrarily choose , where is the set of sources of . Let be a master source of , namely .
Lemma 6.
For any and flow satisfying , there is a flow which satisfies .
Proof.
We proceed case by case.
Case 1: . For any , define demand vector such that for any ,
By Lemma 2, has a decomposition satisfying .
Now for any , define flow such that for any ,
and demand vector such that for any ,
Our task is reduced to establishing three claims.
Claim 1: for any , satisfies .
It suffices to show for any , where , which is the net incoming of flow at vertex . Obviously, is linear in .
Arbitrarily fix . By definition of ,
Claim 2: satisfies . It immediately follows from Claim 1.
Claim 3: .
It holds because for any ,
The proof of Case 1 finishes.
Case 2: . The lemma trivially holds.
Case 3: .
The proof of Case 1 almost works, except that is not well-defined and the decomposition of does not include . As a result, we still apply the proof of Case 1, after defining and to be all-zero vectors. ∎
Remark 1.
Lemma 6 remains true if is replaced by .
Algorithm 2 is a simple algorithm for LoMuF with guaranteed approximation ratio.
Input: a capacitated graph , supply vectors
Output:
Theorem 7.
Algorithm 2 is -approximate, where .
Proof.
Arbitrarily fix a capacitated graph and supply vectors as input to Algorithm 2. Let be the output. If , each the unique source of , which is trivially optimum. Hence, we assume and show that establish approximation ratio .
Let . Suppose is an optimum solution to LoMuF. This means that there is a multi-commodity flow satisfying .
For any , apply Lemma 6 with , getting a flow which satisfies . As a result, we find a valid multi-commodity flow satisfying , so . The proof ends. ∎
Remark 2.
Note that in Remark 2, if the -entry dominates for any , namely . A special such case is when every supply vector has no more than 2 sources. Then by Remark 2, we immediately have the following corollary.
Corollary 8.
When every supply vector has a dominant entry, Algorithm 2 exactly solves LoMuF.
4 Hardness and Algorithms of Di-LoMuF
In this section, we adapt LoMuF to networks modeled as directed graphs. Such networks have also been studied in the network flow community and frequently appear in nowadays practice. For example, only down-streaming traffics are allowed by many data servers.
We adopt the notation and concepts in Section 2 in case of no ambiguity, with three exceptions:
-
•
Every edge has an inherent direction and is called an arc. An arc from vertex to vertex is denoted by . We usually use to represent a capacitated directed with vertex set , arc set , and capacity vector . Accordingly, (, respectively) stands for the set of incoming (outgoing, respectively) arcs at vertex . Likewise, define and for vertex subset .
-
•
Any arc only allows a flow in the inherent direction, so we can naturally specify a network flow using a non-negative vector .
-
•
We continue to study the problem of target location for maximizing concurrent multi-commodity flow, but in the context of directed graphs. The problem will be called Di-LoMuF to highlight the directed model.
The following theorem indicates the strong relation between LoMuF and Di-LoMuF.
Theorem 9.
LoMuF is reducible to Di-LoMuF.
Proof.
Arbitrarily fix an capacitated graph and supply vectors . We will construct a capacitated direct graph and supply vectors , and prove that the construction preserves the quality of solutions.
Step 1: Construct and the supply vectors.
The directed graph is obtained by replacing any edge of with the diamond gadget as illustrated in Figure 3. Specifically, , , and for any arc in the diamond corresponding to edge , . For any , define such that for any ,

Step 2: Prove that for any , .
Consider any and any valid multi-commodity flow satisfying . For any , define flow as follows: for any , if is from to , set , otherwise set ; for any other arc . It is straightforward to check that the multi-commodity flow is valid and satisfies
Step 3: Prove that for any , there are such that .
Consider any and any valid multi-commodity flow satisfying . For any , define flow as follows. For any oriented from to , we deal case by case:
-
•
When , set .
-
•
When , set if , otherwise .
Now for any , we find a proper . This is also done case by case:
-
•
When , let .
-
•
When for , let if , otherwise .
Again, it is easy to check that the multi-commodity flow is valid and satisfies . ∎
Remark 3.
We further show that Di-LoMuF has no PTAS.
Theorem 10.
Unless P=NP, Di-LoMuF cannot be efficiently approximated with a ratio smaller than 2.
Proof.
We establish a reduction from 3-DM to Di-LoMuF and show that the solutions to Di-LoMuF has a big gap indicating whether or not a perfect matching exists.
Arbitrarily fix an instance of 3-DM. We construct an capacitated directed graph and supply vectors . Specifically, as illustrated in Figure 4, is adapted from the undirected graph in Figure 2, up to two modifications:
-
•
The red parts, namely, vertices and their incident edges, are removed.
-
•
All the arcs are directed upward, as indicated by the arrows.
All the arcs has capacity 1. Define supply vectors such that for any ,

Our theorem immediately holds if we have the following two facts:
Fact 1: If contains a perfect matching, .
To prove this fact, suppose without loss of generality that is a perfect matching in . For any , assume with , and define a flow such that for any arc ,
It is straightforward to check that the multi-commodity flow is valid and satisfies . Hence, .
Fact 2: If contains no perfect matching, .
Let be such that . One immediately sees that for any , unless . Without loss of generality, assume that for any .
Let be a valid multi-commodity flow that satisfies .
Since is not a perfect matching, there must be such that . Again without loss of generality, assume that and . For any , one can observe that for any , because a flow on such an arc can not reach or .
Then by Lemma 3, where . Considering that , we have . ∎
To investigate the borderline of the intractability of Di-LoMuF, one might impose restrictions on instances to make them simple. One dimension of simplification is to upper bound the source number of the supply vectors. When every supply vector has only one source, the sources altogether form a trivial optimum solution to Di-LoMuF. Hence it is reasonable to focus on bi-source supply vectors, namely those each having at most two sources. Another dimension of simplification is to focus on simple graphs, so directed trees (called di-trees) are natural candidates. A di-tree is a directed graph which, after removing the directions of the arcs and neglecting multi-edges, becomes an undirected tree. A di-path can be defined likewise. To make our result as strong as possible, we further require that the di-trees are symmetric. A capacitated directed graph is called symmetric, if (1) all arcs have equal capacity, and (2) once having an arc , it also has the twin arc . We will show that Di-LoMuF remains hard even on these nearly trivial instances.
Before continuing, recall the 3-partition problem, which is well-known to be strongly NP-hard [8, page 99]. An instance of the 3-partition problem is a multi-set of positive integers with for some integer . The objective is to decide whether has an equi-partition, namely a partition of such that for any .
Theorem 11.
Di-LoMuF is NP-hard on symmetric di-paths and bi-source supply vectors
Proof.
We prove the theorem via a reduction from 3-partition problems to Di-LoMuF. For this end, given an instance of 3-partition problem, we set about to construct a symmetric di-path and bi-source supply vectors.

Specifically, as illustrated in Figure 5, the di-path consists of vertices and arcs and for any . Each arc has capacity , where . For any and , define supply vectors such that for any ,
For notational simplicity, we sometimes use to stand for , respectively.
Our proof will be done in two steps.
Step 1. If has an equi-partition, then .
Let be an equi-partition of . For any , let satisfy , and we define flow such that for any ,
One can check that satisfies demand vector .
For any , define flows such that for any ,
Obviously, satisfies , and satisfies .
It is straightforward to check that all these flows form a valid multi-commodity flow. Altogether, we have .
Step 2. If , has an equi-partition.
Suppose are such that there is a valid multi-commodity flow satisfying demand vectors .
For any , let . We proceed in two substeps.
Step 2.1. For any , and .
Arbitrary fix . For contradiction, assume that . Applying Lemma 3 to , one get , contradictory to the fact that . Hence, . Likewise, one can further show that . As a result, .
In a similar way, we also have .
Step 2.2. has an equi-partition.
Arbitrarily fix . Applying Lemma 3 to and to respectively, one gets
(2) |
Let . For any , apply Lemma 3 to , and we have
(3) |
Likewise, for any , applying Lemma 3 to results in
(4) |
Then,
(5) |
Let and . For any , define , which satisfies . This means that is an equi-partition of .∎
Remark 4.
Recall Corollary 8 which implies the tractability of LoMuF on bi-source supply vectors. It is in sharp contrast to the intractability of Di-LoMuF in this situation. Furthermore, Theorem 5 claims that LoMuF is polynomial-time solvable when the input graph is a tree, but Di-LoMuF remains NP-hard even on symmetric di-paths. These serves as an evidence that LoMuF is generally harder than LoMuF.
We have seen the hardness of Di-LoMuF even in the nearly-trivial cases. Fortunately, the next theorem will relieve us from frustration, because it indicates the possibility to approximately solve Di-LoMuF. A new definition is needed.
Given a capacitated directed graph , for any , let . Define the induced graph of to be the capacitated undirected graph , where , and for any , . Intuitively, is obtained from by neglecting the direction of the arcs and merging the capacities of twin arcs if any.
Theorem 12.
Di-LoMuF has a polynomial-time 2-approximation algorithm on symmetric di-trees.
Proof.
Arbitrarily fix a symmetric di-tree and supply vectors . Let be the induced graph of and arbitrarily orient the edges. Suppose be the output of Algorithm 1 when the input is . We set about to prove that is a 2-approximate solution to Di-LoMuF on the instance .
Let and . Our task is reduced to proving two claims.
Claim 1. .
Let be a valid multi-commodity flow satisfying , where .
For any , define flow such that for any arc ,
It is straightforward to check that is a valid multi-commodity flow on that satisfies . Hence, Claim 1 holds.
Claim 2. .
Let be such that there is a valid multi-commodity flow on which satisfies . For any , define flow on as follows: for any edge , if it is oriented from to in , set . Roughly speaking, each is obtained from by merging traffics on twin arcs.
Again, it is easy to check that is a valid multi-commodity flow on that satisfies . This immediately leads to Claim 2.
Combining Claims 1 and 2, we have , which means that is a 2-approximate solution to Di-LoMuF on the instance . ∎
Theorem 12 can be extended to general symmetric directed graphs. Recall the concept concentration defined in Remark 2.
Corollary 13.
Di-LoMuF has a polynomial-time -approximation algorithm on symmetric directed graphs, where the concentration of the supply vectors.
5 Other Variants of LoMuF
We continue to handle other variants of LoMuF, which are defined by extending LoMuF in three dimensions:
- 1.
-
2.
Different solution constraints. We restrict the targets to be chosen from a candidate set, rather than from the entire vertex set. This properly models the practical situation where applications can be deployed to prescribed servers. Such a restricted version of LoMuF is called restricted-LoMuF.
-
3.
Different optimization goals. The network flow community typically serves three optimization goals: concurrent flow value which proportionately maximizes the flows, total flow value which maximizes the summation of all flows, and feasibility which maximize the number of feasible flows. Since concurrent flow value has been elaborated on in the previous sections, this section will investigate the latter two.
Now we begin to present some results of the variants.
Unsplittable flow: A flow is unsplittable if it can be decomposed into flow paths each of which corresponds to the flow from one source to the target and the correspondence is one-to-one. By a flow path, we mean a flow which has non-zero congestion only along a path, and we say that a flow path passes an edge if the flow has non-zero congestion on the edge.
Since on trees there is a unique path connecting any two vertices, flows on trees are intrinsically unsplittable. Consequently, by Theorem 5, even under the unsplittable flow model, LoMuF on trees is polynomial-time solvable. Actually, all the results in the previous sections remain true under the unsplittable flow model, since all the flows in the proofs are unsplittable. Moreover, stronger results can be obtained. See the following theorem as an example.
Theorem 14.
Under the unsplittable flow model, LoMuF is NP-hard and cannot be approximated within ratio 2 in polynomial time.
Proof.
Roughly speaking, we reduce 3-DM to LoMuF, and show that the solutions to LoMuF has a big gap of unsplittable flows indicating whether or not a perfect matching exists.
Basically, we follow the proof of Theorem 4. Given an instance of 3-DM with and , let be the capacitated undirected graph as constructed in the proof of Theorem 4 (illustrated in Figure 2). We also adopt supply vectors as in the proof of Theorem 4. For ease of reading, the vectors are redefined here. For any ,
The rest of the proof consists of two parts.
Part 1: a perfect matching in implies , where the subscript indicates that the objective value is under the unsplittable flow model.
The proof is identical to the counterpart of the proof of Theorem 4, so omitted here.
Part 2: If contains no perfect matching, .
Suppose the optimum targets are , and the multi-commodity flow is valid and satisfies . Under the unsplittable flow model, for any , where , called a summand path of , is a flow path from (or when ) to , and likewise for .
Let be the set of edges incident to vertices in . For , it is easy to observe two facts:
- Fact 1
-
: If , each summand path of has non-zero congestion on at least one edge in .
- Fact 2
-
: If , there are two summand paths of each having non-zero congestions on at least two edges in .
Now we proceed case by case.
Case 1: for some . By Facts 1 and 2, considering that there are edges in and summand paths of all the flows, there must be an edge shared by at least two flow paths. Since a flow path has congestion on any edge along it and each edge has capacity 1, we see that .
Case 2: for some . and altogetger have six summand paths, each of which arrives . However, has only three incident edges, so at least two of the summand paths share an incident edge of . Again, since a flow path has congestion on any edge along it and each edge has capacity 1, we have .
Case 3: ’s lie in and are pairwise different. Assume without loss of generality that , for any . Because contains no perfect matching, there exist such that . Again without loss of generality, assume .
If does not pass the edge , it must pass more than one edge in before reaching . Following the argument of Case 1, we see that . Likewise, we have if does not pass the edge .
What’s remaining is when both and pass the edge . One gets due to the capacity constraint on this edge. ∎
Then we show that restricting targets (i.e., targets can be chosen only in a candidate set of vertices) substantially affects the hardness of target location problems. Since the unrestricted version is a special case of the restricted one, all the hardness results (including the lower bounds of the approximation ratios) remain valid. In fact, restricting targets may make the problems harder, which is confirmed below. Recall Theorem 5 which claims that LoMuF on trees is polynomial-time solvable. Nevertheless, with restricted targets, LoMuF on trees even has no PTAS.
Before going on, let’s recall a property of 3-DM. Let be an instance of 3-DM. For any , define its covering set to be . It is known that remains NP-complete even on 3-covered instances, namely, [8, page 221].
Theorem 15.
LoMuF with restricted targets is NP-hard on trees and cannot be approximated within ratio in polynomial-time.
Proof.
We prove the theorem by reducing 3-DM to LoMuF.
Arbitrarily fix a 3-covered instance of 3-DM. Let and with . We will construct an instance of LoMuF with restricted targets, including a capacitated undirected graph , supply vectors, and a candidate set of vertices in which the targets of the supply vectors can be located.
Specifically, as illustrated in Figure 6, is a tree consisting of a root and the set of leaves, and the capacity of each edge is 6. All the edges are oriented from leaves to the root.

Let . For any , define a supply vector such that for any ,
For any , define a supply vector such that for any ,
Let the candidate set be , i.e., we are not allowed to choose the root as targets.
The rest of the proof consists of two parts.
Part 1: If contains a perfect matching, , where the subscript indicates that the candidate set for the targets is .
Without loss of generality, suppose is a perfect matching. Let be the mapping such that for any . For any , define flow such that for any edge ,
One can check that satisfies the demand vector .
Then arbitrarily fix a bijective mapping . For any , define flow such that for any edge ,
One can check that satisfies the demand vector .
Furthermore, it is easy to see that the flows form a valid multi-commodity flow. Hence we finishes the proof of Part 1.
Part 2: If contains no perfect matching, .
Let for , for be such that there is a valid multi-commodity flow for , for satisfying for , for .
First of all, for any edge , we can observe two facts:
(6) |
where and , and
(7) |
where . The detailed proof is omitted since the inequalities are immediate results of applying Lemma 3 to .
Then we proceed case by case.
Case 1: for some .
Let . By (6) and (7), the total congestion on edge satisfies . By capacity constraint on , we have .
Case 2: for some .
Let . By (6) and (7), the total congestion on edge satisfies . By capacity constraint on , we have .
Case 3: there exists such that .
Let . By (6), . By capacity constraint on , we have .
The rest of the proof will assume that none of the Cases 1-3 happens. Let and . We have
(8) |
By the pigeon hole principle, one further sees that for any , .
Case 4: there exists such that .
Let . Since and , by (6), . By capacity constraint on , we have .
Case 5: None of the above cases happens.
Since for any , which is at most due to (8). Recall that , so . As a result, for any , which implies that is a perfect matching, contradictory to the assumption that contains no perfect matching. Therefore, Case 5 never happens.
To sum up all the cases, . The proof ends. ∎
The following theorem is also a surprise. In the unrestricted case, if all the supply vectors are uni-source (i.e., each having a single source), a trivial optimum solution to LoMuF is choosing the sources themself as targets. However, when targets are restricted to a prescribed sets, LoMuF becomes NP-hard even on uni-source supply vectors and stars (i.e., trees of depth ).
Theorem 16.
LoMuF with restricted targets is NP-hard on uni-source supply vectors and stars.
Proof.
We prove the theorem via a reduction from 3-partition problem to LoMuF. For this end, given an instance of 3-partition problem, we set about to construct an instance of LoMuF with restricted targets, including a capacitated star, supply vectors, and a candidate set of targets.

Specifically, as illustrated in Figure 7, the capacitated undirected star consists of the center and the set of leaves . Orient every edge to point to . Each edge has capacity , where . For any , define supply vector such that and for any . Appoint to be the candidate set of targets.
Our proof will be done in two steps.
Step 1. If has an equi-partition, then .
Let be an equi-partition of . For any , let satisfy , and we define flow such that for any ,
One can check that satisfies demand vector .
Since is an equi-partition of , all these flows form a valid multi-commodity flow. Hence, we have .
Step 2. If , has an equi-partition.
Let be such that there is a valid multi-commodity flow satisfying . For any , let . Applying Lemma 3, we have for any . Due to the capacity constraint, one gets . As a result, , meaning that for any . Hence, is an equi-partition of . ∎
Then we discuss the target location version of the maximum multi-commodity problem. Arbitrarily fix supply vectors on a capacitated directed/undirected graph with vertex set , where is a finite index set. Roughly speaking, we are to locate targets for the supply vectors so as to maximize the total flow values. In particular, we have to find to maximize , where non-negative reals ’s are such that
-
1.
For any , there exists a flow satisfying the demand vector , and
-
2.
is a valid multi-commodity flow.
It is worth noting that all the preceding results in this paper still hold (and the proofs are also valid), except that we are not sure whether the lower bounds of approximation ratio remain true.
Finally, we investigate the target location version of the maximum feasibility problem (maxf-LoMuF for short). Intuitively, our goal is to locate the targets so as to maximize the number of satisfiable supply vectors. Formally, given a set of demand vectors on a capacitated network , its feasibility is defined to be the maximum subset of that can be simultaneously satisfied, namely, . Given supply vectors on a capacitated directed/undirected graph with vertex set , the task of maxf-LoMuF is to find so as to maximize . By abusing notation, the optimum objective value will also be denoted by .
We will show that maxf-LoMuF is hard to approximate. The proof relies on a reduction from the well-studied maximum independent set problem (MIS) which aims to find a maximum set of vertices that are pairwise non-adjacent in a given graph. Let’s first recall a property of MIS.
Lemma 17 ([11]).
For any constant , unless NP=ZPP, MIS can not be approximated within on graphs of vertices for any constant .
Theorem 18.
For any constant , unless NP=ZPP, the maxf-LoMuF problem on supply vectors cannot be approximated within in polynomial-time.
Proof.
We prove by reducing MIS to maxf-LoMuF. Namely, given a graph , we will construct a capacitated graph and supply vectors , where .
Specifically, , where . , where and . Every edge of has capacity 1. The graph is illustrated in Figure 8. We choose to orient every edge upward.

For any , define a supply vector such that for any ,
Arbitrarily fix a subset . We prove two claims:
-
•
Claim 1: If is an independent set of , then .
Suppose is an independent set of . For any , define flow such that for any with ,
It is easy to check that is a valid multi-commodity flow satisfying . Hence, .
-
•
Claim 2: If , then is an independent set of .
Assume . Choose such that there is a valid multi-commodity flow which satisfies .
We set about to show that is an independent set of . For contradiction, suppose are such that and are both incident to in . By Lemma 3, no matter where lies, we always have . Likewise, we also have . Considering that , we reach a contradiction. Claim 2 holds.
By Claims 1 and 2, for any -approximate solution to the instance of maxf-LoMuF, we can construct an -approximate solution to the instance of MIS, and vice versa. Then the theorem holds due to Lemma 17. ∎
We have a trivial approximation algorithm for maxf-LoMuF: Given a capacitated graph and supply vectors , by enumerating, find the first and such that . This algorithm obviously has approximation ratio , which is nearly optimum due to Theorem 18.
Remark 5.
The above results (including the algorithm) about maxf-LoMuF can be extended to directed graphs. The proofs remain valid up to minor modifications, so detailed proof are omitted here.
6 Conclusion
We formulated the target location problem for multi-commodity flows. It is a natural combination of the classic facility location problem and the multi-commodity flow problem, and extends both. It is interesting in theory and well-rooted in real-world applications.
We mainly study the issue of maximizing concurrent flows, both on directed and undirected networks. It is interesting to see that the directed case makes the problem harder: the problem is efficiently solvable on undirected trees, but NP-hard on di-paths. Another separation is that the problem is efficiently solvable for bi-source supply vectors on undirected graphs, while it is NP-hard for such supply vectors on directed graphs. We have also made progress on algorithm design: in addition to an exact algorithm on trees, an approximation algorithm is proposed for arbitrary undirected graphs, which leads to algorithms on symmetric directed graphs.
As the first step towards this novel direction, there remain numerous open questions. Just mention a few.
-
1.
Though an -approximation algorithm exists on undirected networks, we know nothing about the lower bound of approximation ratio of the problem. Even whether a PTAS exists remains open. The directed situation is less satisfactory: except a trivial algorithm with approximation ratio , no non-trivial approximation algorithm on general directed graphs is known.
-
2.
The variants deserve further studying. Since in many applications, targets can be chosen only from a candidate set, restricted version of our problem is of special interest. Cost minimization is an active topic in classic network flow problems. It can be easily defined in our framework, and is a rich research direction. One more variant has not yet been mentioned: This paper allows choosing just one target for each commodity, but what if more targets can be selected?
-
3.
Online versions of our problem are also well motivated. Recall the scenario of geo-distributed data analysis. The typical case is that the applications arrive sequentially in an online fashion, rather than all at once as we mentioned before. The online fashion poses special challenges in algorithm design. We are even not sure whether an algorithm exists with guaranteed competitive ratio.
Acknowledgement
We are grateful to the anonymous referees for detailed corrections and suggestions. We also thank Prof. Yungang Bao for his encouragement and support. Special thanks go to Dr. Laiping Zhao, who helped the authors formulating the problem.
References
- [1] H. An, M. Singh, and O. Svensson. Lp-based algorithms for capacitated facility location. In 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, pages 256–265, 2014.
- [2] K. Andreev, C. Garrod, D. Golovin, B. M. Maggs, and A. Meyerson. Simultaneous source location. Acm Transactions on Algorithms, 6(1):Article No. 16, 2009.
- [3] K. Arata, S. Iwata, K. Makino, and S. Fujishige. Locating sources to meet flow demands in undirected networks. Journal of Algorithms, 42(1):54–68, 2002.
- [4] A. BernáTh. Source location in undirected and directed hypergraphs. Oper. Res. Lett., 36(3):355–360, 2008.
- [5] C. Chekuri, S. Khanna, and F. B. Shepherd. The all-or-nothing multicommodity flow problem. SIAM Journal on Computing, 42(4):1467–1493, 2013.
- [6] P. Christiano, J. Kelner, A. Madry, D. Spielman, and S.-H. Teng. Electrical flows, laplacian systems, and faster approximation of maximum flow in undirected graphs. Proceedings of the Annual ACM Symposium on Theory of Computing, page 273–282, June 2011.
- [7] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms. MIT press, 2009.
- [8] M. R. Gary and D. S. Johnson. Computers and intractability: A guide to the theory of np-completeness, 1979.
- [9] S. Guha and S. Khuller. Greedy strikes back: Improved facility location algorithms. Journal of Algorithms, 31(1):228–248.
- [10] S. L. Hakimi. Optimum locations of switching centers and the absolute centers and medians of a graph. Operations Research, 12(3):450–459, june 1964.
- [11] J. Hastad. Clique is hard to approximate within . Acta Mathematica, 182(1):105–142, 1996.
- [12] P. Hebler and H. W. Hamacher. Sink location to find optimal shelters in evacuation planning. EURO Journal on Computational Optimization, 4(3):325–347, 2016.
- [13] H. Ito, M. Paterson, and K. Sugihara. The multi-commodity source location problems and the price of greed. Journal of Graph Algorithms and Applications, 13(1):55–73, 2009.
- [14] J. A. Kelner, Y. T. Lee, L. Orecchia, and A. Sidford. An almost-linear-time algorithm for approximate max flow in undirected graphs and its multicommodity generalizations. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), page 217–226, USA, 2014. Society for Industrial and Applied Mathematics.
- [15] J. A. Kelner, G. L. Miller, and R. Peng. Faster approximate multicommodity flow using quadratically coupled flows. In Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, page 1–18, New York, NY, USA, 2012. Association for Computing Machinery.
- [16] S. Khuller. Data-Aware Scheduling in Datacenters. PhD thesis, University of Maryland (College Park, Md.), 2016.
- [17] P. Kolman and C. Scheideler. Improved bounds for the unsplittable flow problem. In Thirteenth Acm-siam Symposium on Discrete Algorithms, 2002.
- [18] G. Kortsarz and Z. Nutov. A note on two source location problems. Journal of Discrete Algorithms, 6(3):520–525, sep 2008.
- [19] R. Krauthgamer, J. R. Lee, and H. Rika. Flow-cut gaps and face covers in planar graphs. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 525–534. SIAM, 2019.
- [20] M. Labbe, D. Peeters, and J.-F. Thisse. Location on networks. Network Routing, 8, 02 1992.
- [21] Y. T. Lee, S. Rao, and N. Srivastava. A new approach to computing maximum flows using electrical flows. In Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, page 755–764, 2013.
- [22] S. Li. A 1.488 approximation algorithm for the uncapacitated facility location problem. In Automata, Languages & Programming-international Colloquium, Icalp, Zurich, Switzerland, July, Part II, pages 45–58.
- [23] A. Madry. Computing maximum flow with augmenting electrical flows. In Proceedings of the IEEE 57th Annual Symposium on Foundations of Computer Science, pages 593–602, 2016.
- [24] S. Mamada, K. Makino, and S. Fujishige. Optimal sink location problem for dynamic flows in a tree network. Ieice Transactions on Fundamentals of Electronics Communications and Computer Sciences, E85-A(5):1020–1025, 2002.
- [25] S. Mamada, T. Uno, K. Makino, and S. Fujishige. An algorithm for the optimal sink location problem in dynamic tree networks. Discrete Applied Mathematics, 154(16):2387–2401, 2006.
- [26] L. Monis, B. Kunjumon, and N. Guruprasad. Implementation of Maximum Flow Algorithm in an Undirected Network, pages 195–202. Springer Singapore, Singapore, 01 2019.
- [27] M. T. Y. J. Naga, U. S., N. A., and G. S. A technical survey on optimization of processing geo-distributed data. Journal of Physics: Conference Series, 1000:012140, 2018.
- [28] R. Peng. Approximate undirected maximum flows in o(mpolylog(n)) time. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, page 1862–1867, 2016.
- [29] Q. Pu, G. Ananthanarayanan, P. Bodik, S. Kandula, A. Akella, P. Bahl, and I. Stoica. Low latency geo-distributed data analytics. In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication, SIGCOMM ’15, page 421–434, 2015.
- [30] M. Sakashita, K. Makino, and S. Fujishige. Minimum cost source location problems with flow requirements. Algorithmica, 50(4):555–583, 2006.
- [31] A. Salmasi, A. Sidiropoulos, and V. Sridhar. On constant multi-commodity flow-cut gaps for families of directed minor-free graphs. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 535–553. SIAM, 2019.
- [32] F. B. Shepherd, A. Vetta, and G. T. Wilfong. Polylogarithmic approximations for the capacitated single-sink confluent flow problem. In Proceedings of the IEEE 56th Annual Symposium on Foundations of Computer Science, pages 748–758, 2015.
- [33] J. Sherman. Area-convexity, regularization, and undirected multicommodity flow. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 452–460, 2017.
- [34] D. B. Shmoys, va Tardos, and K. Aardal. Approximation algorithms for facility location problems (extended abstract). In Proceedings of the Twenty-Ninth Annual ACM Symposium on the Theory of Computing, El Paso, Texas, USA, May 4-6, 1997, 1997.
- [35] H. Tamura, M. Sengoku, S. Shinoda, and T. Abe. Location problems on undirected flow networks. IEICE Trans., E73:1989–1993, 1990.
- [36] B. C. Tansel, R. L. Francis, T. J. Lowe, M. Science, and N. Apr. Location on networks : A survey . part ii : Exploiting tree network structure. Management Science, 29(4):498–511, 1983.
- [37] V. V. Vazirani. Approximation Algorithms. Springer, 2001.
- [38] L. Yin, J. Sun, L. Zhao, C. Cui, J. Xiao, and C. Yu. Joint scheduling of data and computation in geo-distributed cloud systems. In 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pages 657–666, 2015.
- [39] L. Yue, L. Zhao, C. Cui, and C. Yu. Fast big data analysis in geo-distributed cloud. In IEEE International Conference on Cluster Computing, 2016.
- [40] L. Zhao, Y. Yang, A. Munir, A. X. Liu, Y. Li, and W. Qu. Optimizing geo-distributed data analytics with coordinated task scheduling and routing. IEEE Transactions on Parallel and Distributed Systems, 31(2):279–293, 2020.