
Improved Approximation Guarantees for Power Scheduling Problems With Sum-of-Squares Constraints

Trung Thanh Nguyen1, Khaled Elbassioni2, Areg Karapetyan3, Majid Khonji2
Abstract

We study a class of combinatorial scheduling problems characterized by a particular type of constraint often associated with electrical power or gas energy. This constraint appears in several practical applications and is expressed as a sum of squares of linear functions. Its nonlinear nature adds complexity to the scheduling problem, rendering it notably challenging, even in the case of a linear objective. In fact, exact polynomial time algorithms are unlikely to exist, and thus, prior works have focused on designing approximation algorithms with polynomial running time and provable guarantees on the solution quality. In an effort to advance this line of research, we present novel approximation algorithms yielding significant improvements over the existing state-of-the-art results for these problems.

Introduction

Management of power systems involves operational routines concerned with the optimal allocation/dispatch of limited resources (power/generation) under various types of objectives and constraints, including total fuel cost, total profit, emissions, active power loss, and voltage magnitude deviation, to name a few. These problems are conventionally modeled as quadratic programs (possibly with discrete variables), which are in general hard to solve exactly or approximately. In this paper, we focus on a class of scheduling problems where the energy constraints are modeled as a sum of squares of linear functions. Such constraints have been studied in a number of works, including electric vehicle charging in smart grids (Yu and Chau 2013; Karapetyan et al. 2018; Elbassioni, Karapetyan, and Nguyen 2019; Khonji et al. 2019); welfare maximization in gas supply networks (Klimm et al. 2022); and scheduling of tasks on machines (Wierman, Andrew, and Tang 2012; Bansal, Kimbrel, and Pruhs 2004).

In (Yu and Chau 2013), the authors investigated the so-called Complex-Demand Knapsack Problem (CKP), formulated to optimize the distribution of power among consumers within Alternating Current (AC) electrical systems. In CKP, power (representing user demand) is expressed in complex numbers (Grainger and Stevenson 1994), where the real part corresponds to active power, the imaginary part stands for reactive power, and the magnitude of the complex number measures the apparent power. Each customer possesses a specific power demand and receives a certain utility if their demand is fully met. However, due to physical limitations on the magnitude of the total power supply available in the system, only a subset of users can be selected for power allocation. Mathematically, the power constraint is formulated as a quadratic function, which is the sum of two squares of linear functions.

The sum-of-squares constraint has also appeared in the context of gas supply networks (Klimm et al. 2022). As noted in (Klimm et al. 2022), a gas pipeline network can be modeled as a directed path graph with different entry and exit nodes. Transportation requests are characterized by their entry and exit nodes and the amounts of gas to be transported. The gas flow on an edge from a node u to a node v is modeled by the Weymouth equations (Weymouth 1912), describing the difference of the squared pressures between these nodes. For the operation of gas networks, one needs to choose a subset of transportation requests to be served such that the maximum difference of the squared pressures in the network does not exceed a given threshold. This pressure constraint can then be formulated as a sum of squares of linear functions.

Mathematically, the above two problems can be cast as a binary program that seeks to maximize a linear objective function subject to a sum-of-squares constraint. This problem covers the classical Knapsack problem as a special case and is thus \mathrm{NP}-hard. The first algorithmic result was given in (Woeginger 2000), where it was proved that the problem is strongly \mathrm{NP}-hard, and thus does not admit a fully polynomial time approximation scheme (FPTAS). Yu and Chau (Yu and Chau 2013) developed a (0.5-\epsilon)-approximation algorithm, which was subsequently improved to a PTAS in (Chau, Elbassioni, and Khonji 2016), the best possible result one can hope for. In (Elbassioni and Nguyen 2017) the authors further extended this result to the case where the constraint contains a constant number of squares and devised a (\frac{1}{e}-\epsilon)-approximation algorithm for the variant with a submodular objective. Recently, (Klimm et al. 2022) studied the version with an unbounded number of squares, presented a constant 0.618-approximation algorithm for a linear objective, and ruled out the existence of a 0.99-approximation. These results rely on the so-called pipage rounding technique. One interesting question raised here is whether this rounding technique can be extended to nonlinear objectives such as quadratic and submodular functions, which are frequently encountered in real-world economic contexts.

The minimization version of CKP (\mathsf{MinCKP}) with a linear objective and a sum-of-two-squares constraint has been explored in (Elbassioni, Karapetyan, and Nguyen 2019). As noted therein, this version constitutes a special case of the Economic Dispatch problem (a.k.a. Unit Commitment, Generation Dispatch Control), which is principal to power systems engineering and is responsible for minimizing generation costs while maintaining the balance between the net supply and demand (Wood and Wollenberg 2012). Unlike the maximization version, however, \mathsf{MinCKP}'s natural relaxation is non-convex and hence harder to approximate. Until now, the best-known approximation has been the quasi-PTAS (QPTAS) attained in (Elbassioni, Karapetyan, and Nguyen 2019) via a geometric approach, whereas the existence of a PTAS remained an open question.

Taking a step forward, we present two novel approximations for \mathsf{MinCKP} and a generalized variant of CKP. More concretely, the contributions of this study are as follows:

  • First, we provide a PTAS for \mathsf{MinCKP}, which is a notable improvement upon the state-of-the-art result of a QPTAS provided in (Elbassioni, Karapetyan, and Nguyen 2019). At the heart of the approximation scheme is a polynomial time algorithm for solving the (non-convex) relaxation problem to optimality, by exploiting the special structure of optimal solutions. Note that a PTAS is the best possible result one can hope for, unless \mathrm{P}=\mathrm{NP}, as one can use a technique similar to that in (Woeginger 2000) to rule out the existence of an FPTAS. (In (Woeginger 2000) the author provided a reduction from a variant of Partition to disprove the existence of an FPTAS for the CKP problem; with some minor modifications, this reduction can also be established for the minimization version of CKP.)

  • Next, we show that any \alpha-approximation algorithm for the relaxation problem of maximizing a (non-monotone) submodular quadratic function subject to a general (non-negative) convex quadratic constraint can be turned into an (\frac{\alpha\phi^{2}}{2})-approximation algorithm for the discrete case, where \phi=\frac{2}{1+\sqrt{5}} is the inverse golden ratio. The idea is to generalize the technique of (Klimm et al. 2022) developed for a similar problem with a linear objective: given a fractional solution, round it in such a way that the resulting discrete solution remains feasible and the objective value does not decrease. Since in the studied problem both the objective and the constraint are quadratic, we have to develop a more sophisticated rounding technique than that derived in (Klimm et al. 2022) for the linear objective case.

Remark

A common technique for designing approximation algorithms is to first solve the relaxation to obtain an optimal (fractional) solution, and then round it to an integer one with a small loss in the objective value. It should be noted, however, that \mathsf{MinCKP}'s straightforward relaxation is non-convex, and thus, finding an efficiently computable lower/upper bound on the objective is by itself a major challenge.

Problem Formulations and Notation

In this section, we formalize the mathematical models of the two power scheduling problems under study. The first one, termed \mathsf{MinCKP}, is the minimization version of CKP (Yu and Chau 2013) and is defined as

(\mathsf{MinCKP})\quad\min_{{\bf x}\in\{0,1\}^{n}}\quad{\bf c}^{\top}{\bf x}
\text{s.t.}\quad({\bf p}^{\top}{\bf x})^{2}+({\bf q}^{\top}{\bf x})^{2}\geq C,\qquad(1)

where {\bf p},{\bf q}\in\mathbb{Q}_{+}^{n}, {\bf c}\in\mathbb{Q}^{n} and C>0. We may assume, w.l.o.g., that {\bf c}>0 (if c_{i}\leq 0 for some i\in[n], the corresponding variable x_{i} can be set to 1). Also, it is assumed that {\bf p},{\bf q} are non-negative; otherwise, the problem does not admit an approximation algorithm with finite ratio (Chau, Elbassioni, and Khonji 2016). In \mathsf{MinCKP}, the scalar C denotes the aggregate apparent power demand to be met, while the vectors {\bf p} and {\bf q} capture the active and reactive power outputs of the contracted generation units. Accordingly, the vector {\bf c} represents the costs associated with dispatching these units. These costs may quantify the expenses borne by the load-serving entity (operator of the grid), including the utilization of a generation source, or alternatively, the amount paid due to CO2 emissions.

The second problem, referred to as \mathsf{MaxSUB}, generalizes the one studied by (Elbassioni and Nguyen 2017; Klimm et al. 2022) and takes the following form:

(\mathsf{MaxSUB})\quad\max_{{\bf x}\in\{0,1\}^{n}}\quad{\bf x}^{\top}A{\bf x}
\text{s.t.}\quad\sum_{k=1}^{K}({\bf p}_{k}^{\top}{\bf x})^{2}\leq C,\qquad(2)

where K\leq n, {\bf p}_{k}\in\mathbb{Q}_{+}^{n}, and the objective is submodular, i.e., the matrix A has non-positive off-diagonal entries. Let {\bf a} be the vector of diagonal entries of A. One can assume, w.l.o.g., that {\bf a} is non-negative, since if a_{i}<0 then setting x_{i}=0 does not affect the optimal solution. The constraint (2) can be written in the form {\bf x}^{\top}Q{\bf x}\leq C, for some positive semi-definite (PSD) matrix Q\in\mathbb{Q}_{+}^{n\times n}. Since {\bf x}\in\{0,1\}^{n}, one can write the objective function in the equivalent form f({\bf x}):={\bf x}^{\top}A^{*}{\bf x}+{\bf a}^{\top}{\bf x}, where A^{*} is obtained from A by setting all the diagonal entries to 0, without changing optimal solutions. It is not hard to verify that in this case the (discrete) function f({\bf x}) (over \{0,1\}^{n}) satisfies submodularity, i.e., f({\bf x}^{S})+f({\bf x}^{V})\geq f({\bf x}^{S\cup V})+f({\bf x}^{S\cap V}), for every two sets S,V\subseteq\{1,2,\ldots,n\}, where {\bf x}^{S} denotes the binary vector with x_{i}^{S}=1 if and only if i\in S.
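To make this setup concrete, the following minimal Python sketch (ours, not part of the original formulation; helper names such as build_Q are hypothetical) assembles Q from the vectors {\bf p}_{k} and spot-checks the submodularity inequality above on random binary vectors:

import numpy as np

def build_Q(P):
    # Q = sum_k p_k p_k^T, so that x^T Q x = sum_k (p_k^T x)^2;
    # Q is entrywise non-negative and positive semi-definite.
    return sum(np.outer(p, p) for p in P)

def f(x, A):
    # Objective in the equivalent form x^T A* x + a^T x,
    # where A* is A with its diagonal zeroed out.
    a = np.diag(A).copy()
    A_star = A - np.diag(a)
    return x @ A_star @ x + a @ x

# Spot-check: f(x^S) + f(x^V) >= f(x^{S union V}) + f(x^{S intersect V})
rng = np.random.default_rng(0)
n = 6
A = -rng.random((n, n))              # non-positive off-diagonal entries
np.fill_diagonal(A, rng.random(n))   # non-negative diagonal (the vector a)
for _ in range(100):
    S, V = rng.integers(0, 2, n), rng.integers(0, 2, n)
    union, inter = np.maximum(S, V), np.minimum(S, V)
    assert f(S, A) + f(V, A) >= f(union, A) + f(inter, A) - 1e-9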

Approximation algorithms

For completeness, we briefly review the definition of approximation algorithms and schemes. For \alpha>0, a vector {\bf x}\in\{0,1\}^{n} is said to be an \alpha-approximate solution for a maximization problem (resp., minimization problem) with an objective f, if {\bf x} is a feasible solution satisfying f({\bf x})\geq\alpha\cdot\mathrm{OPT} (resp., f({\bf x})\leq\alpha\cdot\mathrm{OPT}), where \mathrm{OPT} is the value of an optimal solution. For simplicity, if \alpha=1-\epsilon (resp., 1+\epsilon), for \epsilon>0, we shall call {\bf x} \epsilon-optimal. Recall that a PTAS is an algorithm that, for every fixed \epsilon, runs in time polynomial in the input size n and outputs an \epsilon-optimal solution. A QPTAS is similar to a PTAS but the running time is quasi-polynomial (i.e., of the form n^{\mathsf{polylog}\,n}), for every fixed \epsilon. Finally, a PTAS becomes an FPTAS if its running time is polynomial in both the input size and \frac{1}{\epsilon}.

Notations

We write [n]:=\{1,2,\ldots,n\} for any positive integer n. For two vectors {\bf x},{\bf y} we write {\bf x}\leq{\bf y} if x_{i}\leq y_{i} for all i\in[n]. For a vector {\bf x}\in\mathbb{R}^{n} and a subset S\subseteq[n], we write {\bf x}(S):=\sum_{i\in S}x_{i}. Let \|{\bf d}\|_{1}=\sum_{i\in[n]}|d_{i}| and \|{\bf d}\|_{2}=\sqrt{\sum_{i\in[n]}d_{i}^{2}} denote the \ell_{1}-norm and \ell_{2}-norm of a vector {\bf d}, respectively. We denote by {\bf 1} and {\bf 0} the vectors (or matrices of appropriate dimensions) of all 1s and 0s, respectively. Lastly, with a slight abuse of notation, we shall denote the value of an optimal solution of a given instance of the \mathsf{MinCKP} or \mathsf{MaxSUB} problems by \mathrm{OPT}.

An Approximation Scheme for MinCKP

In this section, we present a PTAS for \mathsf{MinCKP}. As previously remarked, the relaxation with continuous variables in [0,1] is non-convex and thus cannot be solved by existing techniques for convex (quadratic) programs. Interestingly, by exploiting the structure of a class of the relaxation's optimal solutions, and carefully analyzing an equivalent dual program, we can reduce the space of candidates so that an optimal solution can be found efficiently. It can then be used to derive an integer solution with a bounded guarantee.

Theorem 1

There is a PTAS for \mathsf{MinCKP}.

Denote by \mathsf{NLP} the (non-convex) relaxation of \mathsf{MinCKP}. It might be the case that the optimum solution {\bf x}^{*} for \mathsf{NLP} is attained at the boundary of the feasible set, and thus has irrational components. Hence, by saying “an optimal solution” we essentially mean a rational solution \widehat{\bf x}\in[0,1]^{n} with {\bf c}^{\top}\widehat{\bf x}\leq{\bf c}^{\top}{\bf x}^{*}+\delta and ({\bf p}^{\top}\widehat{\bf x})^{2}+({\bf q}^{\top}\widehat{\bf x})^{2}\geq C-\delta, for an arbitrarily small \delta>0, where {\bf x}^{*} is an optimum solution for \mathsf{NLP}. For a given \delta>0, we say that \widehat{\bf x} is \delta-feasible for \mathsf{NLP} if it satisfies the second inequality, and \delta-optimal if it satisfies both inequalities. At the heart of our method is the following key result.

Theorem 2

For any given \delta>0, a \delta-optimal solution for the relaxation \mathsf{NLP} can be computed in time \mathsf{poly}(L,\log\frac{1}{\delta}), where L is the length of the binary representation of the input instance.

We assume that \|{\bf p}\|_{1}^{2}+\|{\bf q}\|_{1}^{2}\geq C, so that the problem is feasible. We also assume that {\bf p} and {\bf q} are linearly independent vectors; otherwise, the problem simplifies to a linear program. We will prove that, for a given fixed positive value T, finding a \delta-feasible solution of value exactly T, if one exists, can be achieved in polynomial time. We denote this problem by \mathsf{NLP}[T]. As the objective function and the constraint (1) are both monotone (note that, given a feasible solution to \mathsf{NLP} with value {\bf c}^{\top}{\bf x}<T, we can increase the components of {\bf x}, one by one, without violating the inequality (1), until either {\bf c}^{\top}{\bf x} becomes equal to T, or x_{i} becomes equal to 1 for all i; the latter case cannot happen unless T\geq\|{\bf c}\|_{1}; thus, checking whether the optimum value of \mathsf{NLP} is at most T is equivalent to checking whether \mathsf{NLP}[T] is feasible), this result combined with a binary search (via Algorithm 1) yields an efficient algorithm for finding a \delta-optimal solution to \mathsf{NLP}, hence proving Theorem 2.

Algorithm 1 Binary Search For Solving \mathsf{NLP}
1:Vectors {\bf c},{\bf p},{\bf q}\in\mathbb{Q}_{+}^{n}, a rational number C>0, and an accuracy \delta\in(0,1)
2:A \delta-optimal solution {\bf x} for \mathsf{NLP}
3:\mathsf{LB}\leftarrow 0; \mathsf{UB}\leftarrow{\bf c}^{\top}{\bf 1}
4:while \mathsf{UB}-\mathsf{LB}>\delta do
5:     T\leftarrow\frac{1}{2}(\mathsf{LB}+\mathsf{UB})
6:     Solve \mathsf{NLP}[T]
7:     if \mathsf{NLP}[T] has a \delta-feasible solution {\bf x}\in[0,1]^{n} then
8:         \mathsf{UB}\leftarrow T
9:     else
10:         \mathsf{LB}\leftarrow T
11:return {\bf x}
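The binary search is easy to implement once a feasibility oracle for \mathsf{NLP}[T] is available; the following Python sketch (ours; the oracle feasible_T is a stand-in for Algorithm 2 and is assumed to return a \delta-feasible solution or None) mirrors Algorithm 1:

def binary_search_nlp(c, feasible_T, delta):
    # Bisect on the target objective value T in [0, c^T 1].
    lb, ub = 0.0, sum(c)
    x_best = feasible_T(ub)           # value c^T 1 is always attainable
    while ub - lb > delta:
        T = (lb + ub) / 2             # midpoint of the current interval
        x = feasible_T(T)
        if x is not None:
            ub, x_best = T, x         # feasible: the optimum is at most T
        else:
            lb = T                    # infeasible: the optimum exceeds T
    return x_best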
Lemma 1

For given T,\delta>0, one can find a \delta-feasible solution to \mathsf{NLP}[T] (or decide that none exists) by solving O(n^{2}) linear programs.

Proof.    A high-level description. We rely on a structural property of optimal solutions of \mathsf{NLP}, which will be analyzed using the dual of the linear program ({\mathsf{LP}}) given below. Based on the values of a (basic feasible) optimal dual solution of this linear program, we can classify the variables in an optimal solution for \mathsf{NLP} into those certainly having a value in \{0,1\} and those that may take values in (0,1). Each variable in the latter set (denoted by \Delta below) has the property that the corresponding objective coefficient can be written as a linear combination of the corresponding p_{i} and q_{i}. This property, together with the assumption that {\bf c}^{\top}{\bf x}=T, allows us to eliminate one of the sums \xi:=\sum_{i\in\Delta}p_{i}x_{i}, \zeta:=\sum_{i\in\Delta}q_{i}x_{i}, leading to a single quadratic inequality in only one variable \xi (or \zeta), which can be solved exactly, thus reducing the problem of checking the feasibility of \mathsf{NLP}[T] to solving at most 2 linear programs per candidate pair of tight dual constraints. The formal procedure is given in Algorithm 2.

We first give a structural property of an optimal solution of \mathsf{NLP}, which can help to check the feasibility of \mathsf{NLP}[T] efficiently. Let {\bf x}^{*} be an optimal solution to \mathsf{NLP}. We will use {\bf x}^{*} as a parameter in analyzing the structure of an optimal solution. Let \alpha={\bf p}^{\top}{\bf x}^{*} and \beta={\bf q}^{\top}{\bf x}^{*}. Consider the following linear program:

({\mathsf{LP}})\quad\min_{{\bf x}\in[0,1]^{n}}\quad{\bf c}^{\top}{\bf x}
\text{s.t.}\quad{\bf p}^{\top}{\bf x}=\alpha,\qquad{\bf q}^{\top}{\bf x}=\beta.

Note that {\mathsf{LP}} and \mathsf{NLP} have the same optimal solution. Indeed, every feasible solution to {\mathsf{LP}} is also feasible to \mathsf{NLP}, since ({\bf p}^{\top}{\bf x})^{2}+({\bf q}^{\top}{\bf x})^{2}=\alpha^{2}+\beta^{2}\geq C by the feasibility of {\bf x}^{*}. Since both programs have the same objective and {\bf x}^{*} itself is feasible to {\mathsf{LP}}, every optimal solution to {\mathsf{LP}} must also be optimal to \mathsf{NLP}. In what follows we make use of the dual of {\mathsf{LP}} in analyzing the structure of such a solution.

Algorithm 2 Finding a \delta-Feasible Solution
1:Vectors {\bf c},{\bf p},{\bf q}\in\mathbb{Q}_{+}^{n}, numbers C,T\in\mathbb{Q}_{+}, and an accuracy \delta\in(0,1)
2:Either a \delta-feasible solution {\bf x} to \mathsf{NLP}[T] or a declaration that \mathsf{NLP}[T] is not feasible
3:for each pair (j,k)\in[n]\times[n] with j<k do
4:     if (p_{j},p_{k}) and (q_{j},q_{k}) are linearly independent then
5:         Compute (z_{1},z_{2}) such that
p_{j}z_{1}+q_{j}z_{2}=c_{j};~p_{k}z_{1}+q_{k}z_{2}=c_{k}
6:         \Delta_{0}\leftarrow\{i\in[n]~|~p_{i}z_{1}+q_{i}z_{2}<c_{i}\}
7:         \Delta_{1}\leftarrow\{i\in[n]~|~p_{i}z_{1}+q_{i}z_{2}>c_{i}\}
8:         x_{i}\leftarrow 1 for all i\in\Delta_{1}
9:         x_{i}\leftarrow 0 for all i\in\Delta_{0}
10:         \Delta\leftarrow[n]\setminus(\Delta_{1}\cup\Delta_{0})
11:         \xi\leftarrow\sum_{i\in\Delta}p_{i}x_{i}
12:         \zeta\leftarrow\sum_{i\in\Delta}q_{i}x_{i}
13:         Reduce \mathsf{NLP}[T] to a system in the variables \xi,\zeta
14:         if z_{2}>0 then
15:              Compute O(\frac{\delta}{2^{L}})-approximations \tilde{\xi}_{1}\leq\tilde{\xi}_{2} of the roots of (6) in \xi, if real roots exist; otherwise set \tilde{\xi}_{1}\leftarrow+\infty; \tilde{\xi}_{2}\leftarrow-\infty (inequality (6) then holds for every \xi)
16:              \overline{\xi}\leftarrow{\bf p}(\Delta)
17:              \Omega(\xi)\leftarrow[0,\overline{\xi}]~\bigcap\left((-\infty,\tilde{\xi}_{1}]\cup[\tilde{\xi}_{2},+\infty)\right)
18:              for I\in\Omega(\xi) do
19:                  Define LP as
\left\{\begin{matrix}z_{1}\cdot\left(\sum_{i\in\Delta}p_{i}x_{i}\right)+z_{2}\cdot\left(\sum_{i\in\Delta}q_{i}x_{i}\right)=T-{\bf c}(\Delta_{1}),\\ \sum_{i\in\Delta}p_{i}x_{i}\in I,\qquad x_{i}\in[0,1],~\forall~i\in\Delta.\end{matrix}\right.
20:                  if LP has a solution (x_{i})_{i\in\Delta} then
21:                       return {\bf x}
22:         else \triangleright z_{1}>0
23:              Similar to lines 15–21 (omitted for brevity)
24:return “\mathsf{NLP}[T] is not feasible”

The dual of ({\mathsf{LP}}) is

\max_{{\bf z},{\bf y}}\quad\alpha z_{1}+\beta z_{2}-{\bf 1}^{\top}{\bf y}
\text{s.t.}\quad p_{i}z_{1}+q_{i}z_{2}-y_{i}\leq c_{i},\quad\forall~i\in[n],\qquad(3)
y_{i}\geq 0,\quad\forall~i\in[n].\qquad(4)

Since the primal is feasible (e.g., with the solution {\bf x}^{*}) and bounded, the dual is also feasible; let ({\bf z}^{*},{\bf y}^{*}) be an optimal (basic feasible) dual solution corresponding to the primal solution {\bf x}^{*}. (Note that the linear independence of {\bf p} and {\bf q} guarantees that the optimum of the dual is attained at a vertex.) The optimum values of the primal-dual pair coincide by strong duality. Moreover, by the complementary slackness conditions, it holds that, for all i\in[n],

(i) x^{*}_{i}=0~\text{or}~p_{i}z^{*}_{1}+q_{i}z^{*}_{2}-y^{*}_{i}-c_{i}=0,

(ii) x^{*}_{i}=1~\text{or}~y^{*}_{i}=0.

The complementary slackness conditions imply that the primal solution {\bf x}^{*} satisfies

x^{*}_{i}=\left\{\begin{matrix}0&\text{if}~p_{i}z^{*}_{1}+q_{i}z^{*}_{2}<c_{i},\\ 1&\text{if}~p_{i}z^{*}_{1}+q_{i}z^{*}_{2}>c_{i},\end{matrix}\right.\quad\text{for all}~i\in[n].\qquad(\star)

Note that x^{*}_{i} can take any value in [0,1] when the equality p_{i}z^{*}_{1}+q_{i}z^{*}_{2}=c_{i} holds.

The following observation is due to the linear independence of {\bf p} and {\bf q} and the fact that ({\bf z}^{*},{\bf y}^{*}) is assumed to be a basic feasible solution to the dual.

Observation 1

There must be a pair (j,k)\in[n]\times[n] such that the two vectors (p_{j},p_{k}) and (q_{j},q_{k}) are linearly independent and the dual vector {\bf z}^{*}=(z^{*}_{1},z^{*}_{2}) fulfills the system of equations \{p_{j}z_{1}+q_{j}z_{2}=c_{j},~p_{k}z_{1}+q_{k}z_{2}=c_{k}\}.

Indeed, since ({\bf z}^{*},{\bf y}^{*}) is a basic feasible solution to the dual, there must exist n+2 linearly independent constraints among the constraints (3)-(4) that are tight at ({\bf z}^{*},{\bf y}^{*}). This implies that we must have two distinct indices j,k\in[n] such that the two inequalities (3) and (4) hold as equalities for i=j,k and such that the four obtained equations are linearly independent. This leads to the claim in the observation.

The above observation implies that z^{*}_{1},z^{*}_{2} are uniquely and fully determined by a 2\times 2 linear system obtained by selecting two indices j,k\in[n] such that the two vectors (p_{j},p_{k}) and (q_{j},q_{k}) are linearly independent. Once the solutions z_{1}^{*},z_{2}^{*} are obtained, one can classify the values of the variables in the primal solution {\bf x} as follows:

  • If p_{i}z^{*}_{1}+q_{i}z^{*}_{2}<c_{i}, then x_{i}=0,

  • If p_{i}z^{*}_{1}+q_{i}z^{*}_{2}>c_{i}, then x_{i}=1,

  • If p_{i}z^{*}_{1}+q_{i}z^{*}_{2}=c_{i}, then x_{i}\in[0,1].

In other words, based on the values of z^{*}_{1},z^{*}_{2}, we know that some of the variables in the primal solution must be equal to 1 or 0, but possibly not all. Define \Delta_{1},\Delta_{0} to be the sets containing the indices i with x^{*}_{i}=1 and x^{*}_{i}=0, respectively, and let \Delta=[n]\setminus(\Delta_{1}\cup\Delta_{0}). Then the values of the x^{*}_{i}'s, i\in\Delta, must be optimal for the following program (\mathsf{NLP}[\Delta]), parameterized by \Delta:

\min\quad\sum_{i\in\Delta}c_{i}x_{i}+{\bf c}(\Delta_{1})
\text{s.t.}\quad\left({\bf p}(\Delta_{1})+\sum_{i\in\Delta}p_{i}x_{i}\right)^{2}+\left({\bf q}(\Delta_{1})+\sum_{i\in\Delta}q_{i}x_{i}\right)^{2}\geq C,
x_{i}\in[0,1],\quad\forall~i\in\Delta,

where {\bf p}(\Delta_{1})=\sum_{i\in\Delta_{1}}p_{i}, {\bf q}(\Delta_{1})=\sum_{i\in\Delta_{1}}q_{i}, {\bf c}(\Delta_{1})=\sum_{i\in\Delta_{1}}c_{i}. It follows that the problem of checking if \mathsf{NLP}[T] has a feasible solution is equivalent to checking if there is a feasible solution to \mathsf{NLP}[\Delta] with \sum_{i\in\Delta}c_{i}x_{i}+{\bf c}(\Delta_{1})=T. By the definition of \Delta, it follows that

z_{1}^{*}\cdot\left(\sum_{i\in\Delta}p_{i}x_{i}\right)+z_{2}^{*}\cdot\left(\sum_{i\in\Delta}q_{i}x_{i}\right)=\sum_{i\in\Delta}c_{i}x_{i}=T-{\bf c}(\Delta_{1}).

Based on this, the problem is now to check if the following nonlinear system is feasible in the variables \xi=\sum_{i\in\Delta}p_{i}x_{i}\in[0,\overline{\xi}] and \zeta=\sum_{i\in\Delta}q_{i}x_{i}\in[0,\overline{\zeta}]:

\left\{\begin{matrix}z^{*}_{1}\xi+z_{2}^{*}\zeta=T-{\bf c}(\Delta_{1}),\\({\bf p}(\Delta_{1})+\xi)^{2}+({\bf q}(\Delta_{1})+\zeta)^{2}\geq C,\end{matrix}\right.\qquad(5)

where \overline{\xi}={\bf p}(\Delta) and \overline{\zeta}={\bf q}(\Delta) are upper bounds on the values of \xi and \zeta, respectively.

Since we assume {\bf c}>0, at least one of the dual values z_{1}^{*},z_{2}^{*} must be positive. Without loss of generality, we assume z_{2}^{*}>0. To check the feasibility of (5), one can compute \zeta as a function of \xi from the first equation, and then plug it into the second one, leading to a quadratic inequality in the single variable \xi:

\left(z_{2}^{*}{\bf p}(\Delta_{1})+z_{2}^{*}\xi\right)^{2}+\left(z_{2}^{*}{\bf q}(\Delta_{1})+T-{\bf c}(\Delta_{1})-z_{1}^{*}\xi\right)^{2}\geq(z_{2}^{*})^{2}C,\qquad(6)

from which the set of solutions for \xi is obtained as

\xi^{*}\in[0,\overline{\xi}]~\bigcap\left((-\infty,\xi_{1}]\cup[\xi_{2},+\infty)\right)\equiv\Omega(\xi),

where \xi_{1},\xi_{2} are the two (possibly identical) roots of the quadratic equation obtained by setting the inequality in (6) to an equality. (It is possible that the quadratic equation has no real solution; since the leading coefficient (z_{1}^{*})^{2}+(z_{2}^{*})^{2} is positive, (6) then holds for every \xi, and we may take \Omega(\xi)=[0,\overline{\xi}].) Note that, depending on the signs of \xi_{1} and \xi_{2}, \Omega(\xi) consists of at most two intervals on the real line. Using each interval in \Omega(\xi) and the equation in (5), the remaining variables x^{*}_{i}, i\in\Delta, can be recovered by solving at most 2 linear programs, one for each choice of I\in\Omega(\xi):

\left\{\begin{matrix}z_{1}^{*}\cdot\left(\sum_{i\in\Delta}p_{i}x_{i}\right)+z_{2}^{*}\cdot\left(\sum_{i\in\Delta}q_{i}x_{i}\right)=T-{\bf c}(\Delta_{1}),\\ \sum_{i\in\Delta}p_{i}x_{i}\in I,\\ x_{i}\in[0,1],\quad\forall~i\in\Delta.\end{matrix}\right.

Note that, due to the possible non-rationality of \xi_{1} and \xi_{2}, we may need to work with \delta^{\prime}-approximations of \xi_{1} and \xi_{2}, i.e., compute numbers \tilde{\xi}_{1} and \tilde{\xi}_{2} such that |\xi_{1}-\tilde{\xi}_{1}|\leq\delta^{\prime} and |\xi_{2}-\tilde{\xi}_{2}|\leq\delta^{\prime}. Such approximations can be found in time \mathsf{poly}(L,\log\frac{1}{\delta^{\prime}}), and by choosing \delta^{\prime} sufficiently small, we can guarantee that (6) is satisfied within an additive error of \delta, thus yielding a \delta-feasible solution for \mathsf{NLP}[T].   ❑
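Such root approximations can be obtained, for instance, by bisection; a small Python sketch (ours) of this standard routine, using exact rational arithmetic so that every iterate stays polynomial-size:

from fractions import Fraction

def approx_root(g, lo, hi, delta):
    # Assumes g is continuous with g(lo) <= 0 <= g(hi); returns a rational
    # point within additive error delta of a root.  The loop runs
    # O(log((hi - lo)/delta)) times.
    lo, hi = Fraction(lo), Fraction(hi)
    delta = Fraction(delta)
    while hi - lo > delta:
        mid = (lo + hi) / 2
        if g(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return lo

# Example: the positive (irrational) root of xi^2 - 2 = 0:
r = approx_root(lambda t: t * t - 2, 0, 2, Fraction(1, 10**6))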

Proof.    (of Theorem 1). We rely on the partial enumeration method (Frieze and Clarke 1984). For subsets S_{0},S_{1}\subseteq[n], let \mathsf{NLP}[S_{0},S_{1}] denote the restriction of \mathsf{NLP} in which we enforce x_{i}=0 for i\in S_{0} and x_{i}=1 for i\in S_{1}. It is straightforward to modify Algorithm 1 to find a \delta-optimal solution for \mathsf{NLP}[S_{0},S_{1}]. Given the desired accuracy \epsilon\in(0,1), we consider all subsets S_{1} of size at most h:=\lceil\frac{2}{\epsilon}\rceil and let S_{0}:=\{i\in[n]:~c_{i}>\min_{j\in S_{1}}c_{j}\}. For each considered subset S_{1}, we solve \mathsf{NLP}[S_{0},S_{1}] and round the \delta-optimal solution {\bf x} obtained as described below. Among all the rounded solutions obtained, we return the one with minimum cost. Effectively, we are guessing the h indices with the highest c_{i} values in an optimal solution of \mathsf{MinCKP} and solving for the remaining items. Let us denote the corresponding sets in the optimal guess by S_{0}^{*} and S_{1}^{*}. Let {\bf x}^{*} be a \delta-optimal solution to \mathsf{NLP}[S_{0}^{*},S_{1}^{*}], \alpha={\bf p}^{\top}{\bf x}^{*}, \beta={\bf q}^{\top}{\bf x}^{*}, and consider the following linear program ({\mathsf{LP}}^{\prime}[S_{0},S_{1}]):

\min_{{\bf x}\in[0,1]^{n}}\quad{\bf c}^{\top}{\bf x}
\text{s.t.}\quad{\bf p}^{\top}{\bf x}\geq\alpha,\quad{\bf q}^{\top}{\bf x}\geq\beta,
x_{i}=0~\text{for}~i\in S_{0},\qquad x_{i}=1~\text{for}~i\in S_{1}.

A basic fact from linear programming tells us that every feasible and bounded linear program over the box [0,1]^{n} with only two additional non-trivial constraints has an optimal (basic feasible) solution lying at a vertex of the feasible polytope, and such a vertex cannot have more than two fractional components.

Observation 2

There is an optimal solution to {\mathsf{LP}}^{\prime}[S_{0},S_{1}] with at most two fractional entries.

Let z^{*} and z^{\prime} be the optimal values of \mathsf{NLP}[S_{0}^{*},S_{1}^{*}] and {\mathsf{LP}}^{\prime}[S_{0}^{*},S_{1}^{*}], respectively. Then, as {\bf x}^{*} is both \delta-optimal for \mathsf{NLP}[S_{0}^{*},S_{1}^{*}] and feasible for {\mathsf{LP}}^{\prime}[S_{0}^{*},S_{1}^{*}], we must have z^{\prime}\leq z^{*}+\delta. Hence, there exists a basic optimal solution of {\mathsf{LP}}^{\prime}[S_{0}^{*},S_{1}^{*}], with at most 2 fractional components (which can be found efficiently), which is also a \delta-optimal solution of \mathsf{NLP}[S_{0}^{*},S_{1}^{*}]. By rounding each of these two fractional components up to 1, we obtain a \delta-feasible solution \widehat{\bf x} to \mathsf{MinCKP} of value {\bf c}^{\top}\widehat{\bf x} at most

z^{*}+\delta+2\min_{j\in S_{1}^{*}}c_{j}\leq z^{*}+\delta+\frac{2\sum_{j\in S_{1}^{*}}c_{j}}{h}\leq(1+O(\epsilon))\mathrm{OPT},

if we choose \delta=\epsilon\cdot 2^{-2L-1}\leq\epsilon\cdot\min_{j\in[n]}c_{j}. Observe that, for this choice of \delta, ({\bf p}^{\top}\widehat{\bf x})^{2}+({\bf q}^{\top}\widehat{\bf x})^{2}\geq C-\delta implies that ({\bf p}^{\top}\widehat{\bf x})^{2}+({\bf q}^{\top}\widehat{\bf x})^{2}\geq C, since, if positive, C-(({\bf p}^{\top}\widehat{\bf x})^{2}+({\bf q}^{\top}\widehat{\bf x})^{2}) is a rational number of value at least 2^{-2L}>\delta. We conclude that \widehat{\bf x} is a feasible solution to \mathsf{MinCKP} of value at most (1+O(\epsilon))\mathrm{OPT}. The running time is n^{\frac{2}{\epsilon}+O(1)}L^{O(1)}.   ❑
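The overall scheme can be summarized by the following Python sketch (ours; solve_restricted is a hypothetical stand-in for the modified Algorithm 1 applied to \mathsf{NLP}[S_{0},S_{1}] followed by the basic-solution computation for {\mathsf{LP}}^{\prime}[S_{0},S_{1}], returning a solution with at most two fractional entries, or None if infeasible):

import itertools
import math

def ptas_minckp(c, solve_restricted, eps):
    # Partial enumeration over the h = ceil(2/eps) most expensive picks.
    n, h = len(c), math.ceil(2 / eps)
    best, best_val = None, float("inf")
    for r in range(h + 1):
        for S1 in itertools.combinations(range(n), r):
            thr = min((c[i] for i in S1), default=float("inf"))
            S0 = [i for i in range(n) if c[i] > thr and i not in S1]
            x = solve_restricted(S0, S1)
            if x is None:
                continue
            x_int = [1.0 if 0 < v < 1 else v for v in x]   # round up fractions
            val = sum(ci * xi for ci, xi in zip(c, x_int))
            if val < best_val:
                best, best_val = x_int, val
    return best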

Pipage Rounding for MaxSUB

In this section we provide a pipage rounding procedure for MaxSUB, given a (fractional) solution of its relaxation (denoted by \mathcal{F} and obtained by relaxing the binary variables to continuous ones). Our aim is to prove the following result.

Theorem 3

There is an \frac{\alpha\phi^{2}}{2}-approximation algorithm for \mathsf{MaxSUB}, provided that there is an \alpha-approximation algorithm for its continuous relaxation, where \phi=\frac{2}{1+\sqrt{5}}\approx 0.618.

To prove Theorem 3, we construct an algorithm, formally presented in Algorithm 3, and prove its correctness. Before doing so, inspired by (Klimm et al. 2022), let us introduce two relaxations of \mathsf{MaxSUB}, which are needed in the performance analysis of the algorithm later on. In both relaxations, we replace the binary variables {\bf x}\in\{0,1\}^{n} by {\bf x}\in[0,1]^{n}. Recalling that f({\bf x})={\bf x}^{\top}A^{*}{\bf x}+{\bf a}^{\top}{\bf x}, define

(\mathcal{F}_{1}):\quad\max_{{\bf x}\in[0,1]^{n}}\{f({\bf x})~|~{\bf x}^{\top}Q{\bf x}\leq C,~{\bf q}^{\top}{\bf x}\leq C\},
(\mathcal{F}_{2}):\quad\max_{{\bf x}\in[0,1]^{n}}\{f({\bf x})~|~{\bf x}^{\top}Q^{*}{\bf x}+{\bf q}^{\top}{\bf x}\leq C\},

where {\bf q} is the vector containing the diagonal entries of Q, and Q^{*} is obtained from Q by setting all its diagonal entries to 0. It is worth noting that, because of the non-negativity of Q, \mathcal{F} and \mathcal{F}_{1} are equivalent.

Lemma 2

Every feasible integer solution to the original problem is feasible to both relaxations. Also, for every feasible solution {\bf x} to \mathcal{F}_{1}, \phi\cdot{\bf x} forms a feasible solution to \mathcal{F}_{2}. Furthermore, the objective value of \phi\cdot{\bf x} is at least \phi^{2} times that of {\bf x}.

Proof.    The first claim is easy to see. For the second one, note that

(\phi{\bf x})^{\top}Q^{*}(\phi{\bf x})+{\bf q}^{\top}(\phi{\bf x})=\phi^{2}{\bf x}^{\top}Q^{*}{\bf x}+\phi{\bf q}^{\top}{\bf x}\leq\phi^{2}{\bf x}^{\top}Q{\bf x}+\phi{\bf q}^{\top}{\bf x}\leq\phi^{2}C+\phi C=C,

where the first inequality uses {\bf x}^{\top}Q^{*}{\bf x}\leq{\bf x}^{\top}Q{\bf x} (valid as the diagonal of Q is non-negative and {\bf x}\geq{\bf 0}), and the last equality follows from the definition of \phi. It remains to show the last claim. We have that (\phi{\bf x})^{\top}A^{*}(\phi{\bf x})+{\bf a}^{\top}(\phi{\bf x})=\phi^{2}({\bf x}^{\top}A^{*}{\bf x})+\phi{\bf a}^{\top}{\bf x}\geq\phi^{2}\cdot({\bf x}^{\top}A^{*}{\bf x}+{\bf a}^{\top}{\bf x}), since {\bf a}^{\top}{\bf x}\geq 0 and \phi<1.   ❑
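For concreteness, the identity invoked in the last equality can be verified directly:

\phi=\frac{2}{1+\sqrt{5}}=\frac{\sqrt{5}-1}{2}\quad\Longrightarrow\quad\phi^{2}=\frac{3-\sqrt{5}}{2}=1-\phi,\quad\text{so}\quad\phi^{2}C+\phi C=C.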

Performance analysis of Algorithm 3.

Let g({\bf x}):={\bf x}^{\top}Q^{*}{\bf x}+{\bf q}^{\top}{\bf x}. Suppose that {\bf x} is the solution returned at Step 3 of Algorithm 3. Then, it holds that f({\bf x})\geq\alpha f(\tilde{\bf x})\geq\alpha f({\bf x}^{*}), where {\bf x}^{*} and \tilde{\bf x} are, respectively, optimal solutions of \mathsf{MaxSUB} and its relaxation \mathcal{F}_{1}. By Lemma 2, we have that {\bf y}=\phi\cdot{\bf x} is feasible to (\mathcal{F}_{2}). By the correctness of the pipage rounding, to be proved below, O(n) applications of Algorithm 4 result in a solution {\bf y}^{\prime} such that f({\bf y}^{\prime})\geq f({\bf y}) and g({\bf y}^{\prime})\leq g({\bf y})\leq C. Hence, the rounded solution \overline{\bf y} is feasible for the original problem \mathsf{MaxSUB}.

Now let \overline{\bf x} be the integer solution returned by Algorithm 3. Clearly, f(\overline{\bf x})\geq f(\overline{\bf y}). If f(\overline{\bf y})\geq f({\bf y}^{\prime}) then we are done. Otherwise, suppose that f(\overline{\bf y})\leq f({\bf y}^{\prime})\leq f(\overline{\bf y}^{\prime}), where \overline{\bf y}^{\prime} is obtained by rounding the fractional component of {\bf y}^{\prime}, say y^{\prime}_{i}, up to 1. By the submodularity of f, it holds that

f(\overline{\bf y}^{\prime})\leq f(\overline{\bf y})+f({\bf x}^{\{i\}})\leq f(\overline{\bf y})+f({\bf x}^{\{i^{*}\}}),

where i^{*} is defined as in Step 9 of Algorithm 3. (Recall that {\bf x}^{S} denotes the binary vector with x_{i}^{S}=1 if and only if i\in S.)

By the definition of \overline{\bf x} (as the maximum of two values is at least their average), it holds that

f(\overline{\bf x})\geq\frac{1}{2}(f(\overline{\bf y})+f({\bf x}^{\{i^{*}\}}))\geq\frac{1}{2}f(\overline{\bf y}^{\prime})\geq\frac{1}{2}f({\bf y}^{\prime}).

Furthermore, by the definition of {\bf y}^{\prime} and {\bf y} and Lemma 2, we must have that

f({\bf y}^{\prime})\geq f({\bf y})\geq\phi^{2}\cdot f({\bf x})\geq\phi^{2}\alpha\cdot f({\bf x}^{*}).

Hence, f(\overline{\bf x})\geq\frac{\alpha\phi^{2}}{2}\cdot f({\bf x}^{*}).

Algorithm 3 Approx. Algorithm for \mathsf{MaxSUB}
1:An \alpha-approximation algorithm \mathcal{A} for the relaxation of \mathsf{MaxSUB}
2:An integer solution \overline{\bf x} to \mathsf{MaxSUB}
3:Find a (fractional) solution {\bf x} to the relaxation \mathcal{F}_{1} (or \mathcal{F}), using algorithm \mathcal{A}
4:{\bf y}\leftarrow\phi\cdot{\bf x}
5:while there is a pair (y_{i},y_{j})\in(0,1)^{2} do
6:     Apply Algorithm 4 to round (y_{i},y_{j})
7:Let {\bf y}^{\prime} be the resulting solution with at most one fractional component
8:Set the (single) fractional component of {\bf y}^{\prime} (if any) to 0 to get an integer solution \overline{\bf y}
9:i^{*}\leftarrow\arg\max_{i\in[n]}\{a_{i}\}
10:\overline{\bf x}\leftarrow\arg\max\{f(\overline{\bf y}),f({\bf x}^{\{i^{*}\}})\}
11:return \overline{\bf x}
Algorithm 4 Pipage-Rounding-Procedure
1:A pair (x_{i},x_{j})\in(0,1)^{2}; \varphi=a_{ij}x_{i}x_{j}+a^{\prime}_{i}x_{i}+a^{\prime}_{j}x_{j};~\psi=q_{ij}x_{i}x_{j}+q^{\prime}_{i}x_{i}+q^{\prime}_{j}x_{j}
2:A pair (\overline{x}_{i},\overline{x}_{j})\in[0,1]^{2} with at most one fractional component
3:\varphi^{\prime}_{i}\leftarrow a_{ij}x_{j}+a_{i}^{\prime}; \varphi^{\prime}_{j}\leftarrow a_{ij}x_{i}+a_{j}^{\prime}
4:\psi^{\prime}_{i}\leftarrow q_{ij}x_{j}+q_{i}^{\prime}; \psi^{\prime}_{j}\leftarrow q_{ij}x_{i}+q_{j}^{\prime}
5:if \varphi\leq 0 then \overline{x}_{i}\leftarrow 0 and \overline{x}_{j}\leftarrow 0
6:if a^{\prime}_{i}\leq 0 or \varphi^{\prime}_{i}\leq 0 then \overline{x}_{i}\leftarrow 0
7:if a^{\prime}_{j}\leq 0 or \varphi^{\prime}_{j}\leq 0 then \overline{x}_{j}\leftarrow 0
8:if q_{ij}+q_{i}^{\prime}=0 then \overline{x}_{i}\leftarrow\mathsf{argmax}_{x_{i}\in\{0,1\}}\varphi(x_{i},x_{j})
9:if q_{ij}+q_{j}^{\prime}=0 then \overline{x}_{j}\leftarrow\mathsf{argmax}_{x_{j}\in\{0,1\}}\varphi(x_{i},x_{j})
10:if \varphi,\varphi^{\prime}_{i},\varphi^{\prime}_{j},a_{i}^{\prime},a_{j}^{\prime}>0, q_{ij}+q_{i}^{\prime}>0 and q_{ij}+q_{j}^{\prime}>0 then
11:     if \varphi\leq a_{j}^{\prime} and \frac{\varphi^{\prime}_{i}}{\psi^{\prime}_{i}}\leq\frac{a_{j}^{\prime}}{q_{j}^{\prime}} then \triangleright Case 1
12:         \overline{x}_{i}\leftarrow 0, \overline{x}_{j}\leftarrow\min\left\{x_{j}+\frac{x_{i}\psi^{\prime}_{i}}{q_{j}^{\prime}},1\right\}
13:     if \varphi\leq a_{i}^{\prime} and \frac{\varphi^{\prime}_{j}}{\psi^{\prime}_{j}}\leq\frac{a_{i}^{\prime}}{q_{i}^{\prime}} then \triangleright Case 2
14:         \overline{x}_{i}\leftarrow\min\left\{x_{i}+\frac{x_{j}\psi^{\prime}_{j}}{q_{i}^{\prime}},1\right\}, \overline{x}_{j}\leftarrow 0
15:     if \psi\geq q_{i}^{\prime} and \frac{\varphi^{\prime}_{i}}{\psi^{\prime}_{i}}\geq\frac{a_{j}^{\prime}+a_{ij}}{q_{j}^{\prime}+q_{ij}} then \triangleright Case 3
16:         if a_{ij}+a^{\prime}_{j}\leq 0 then \overline{x}_{i}\leftarrow 1, \overline{x}_{j}\leftarrow 0
17:         else \overline{x}_{i}\leftarrow 1, \overline{x}_{j}\leftarrow\max\left\{x_{j}-\frac{(1-x_{i})\varphi^{\prime}_{i}}{a_{ij}+a^{\prime}_{j}},0\right\}
18:     if \psi\geq q_{j}^{\prime} and \frac{\varphi^{\prime}_{j}}{\psi^{\prime}_{j}}\geq\frac{a_{i}^{\prime}+a_{ij}}{q_{i}^{\prime}+q_{ij}} then \triangleright Case 4
19:         if a_{ij}+a^{\prime}_{i}\leq 0 then \overline{x}_{i}\leftarrow 0, \overline{x}_{j}\leftarrow 1
20:         else \overline{x}_{j}\leftarrow 1, \overline{x}_{i}\leftarrow\max\left\{x_{i}-\frac{(1-x_{j})\varphi^{\prime}_{j}}{a_{ij}+a^{\prime}_{i}},0\right\}
21:return (\overline{x}_{i},\overline{x}_{j})
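The case analysis of Algorithm 4 (lines 10-20) can also be organized generically: for each boundary choice of one variable, the two conditions in system (7) below become linear in the other variable, so one only needs to intersect two half-lines with the box. The following Python sketch (ours) implements this view; it is a numerical companion to Lemma 3, not the formal procedure:

def pipage_step(xi, xj, aij, ai, aj, qij, qi, qj, eps=1e-12):
    # phi = aij*u*v + ai*u + aj*v must not decrease;
    # psi = qij*u*v + qi*u + qj*v must not increase (aij <= 0; qij,qi,qj >= 0).
    dphi_i, dphi_j = aij * xj + ai, aij * xi + aj   # partial derivatives
    dpsi_i, dpsi_j = qij * xj + qi, qij * xi + qj

    def restrict(lo, hi, coef, rhs, lower):
        # Intersect [lo, hi] with {t : coef*t >= rhs} (lower=True) or <= rhs.
        if abs(coef) < eps:
            ok = (rhs <= eps) if lower else (rhs >= -eps)
            return (lo, hi) if ok else None
        t = rhs / coef
        if (coef > 0) == lower:
            lo = max(lo, t)
        else:
            hi = min(hi, t)
        return (lo, hi) if lo <= hi + eps else None

    for alpha in (-xi, 1 - xi):                     # Cases 1 and 3
        box = restrict(-xj, 1 - xj, aij * alpha + dphi_j, -dphi_i * alpha, True)
        if box:
            box = restrict(box[0], box[1], qij * alpha + dpsi_j,
                           -dpsi_i * alpha, False)
        if box:
            beta = max(box[0], min(box[1], 0.0))    # any point of the box works
            return xi + alpha, xj + beta
    for beta in (-xj, 1 - xj):                      # Cases 2 and 4
        box = restrict(-xi, 1 - xi, aij * beta + dphi_i, -dphi_j * beta, True)
        if box:
            box = restrict(box[0], box[1], qij * beta + dpsi_i,
                           -dpsi_j * beta, False)
        if box:
            alpha = max(box[0], min(box[1], 0.0))
            return xi + alpha, xj + beta
    return None   # unreachable under the preconditions of Lemma 3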

Correctness of the pipage rounding (Algorithm 4)

We prove that, at the end of the pipage rounding procedure, at most one component of the resulting solution {\bf y}^{\prime} is fractional. To this end, we show that, at each step of the rounding, at least one of the fractional variables is rounded to either 0 or 1, without decreasing f({\bf x}) or increasing g({\bf x}). Note that the value of a fractional variable and its coefficient (which can be a function of other variables) may change after each rounding step. In addition, once a variable has been rounded to 0 or 1, it is never considered again in subsequent steps. In particular, the while loop in Algorithm 3 runs for at most n iterations. Suppose that we are at step k of the rounding procedure, with two fractional variables x^{(k)}_{i},x^{(k)}_{j}\in(0,1). For ease of presentation, we denote by \varphi(x^{(k)}_{i},x^{(k)}_{j}) and \psi(x^{(k)}_{i},x^{(k)}_{j}) the parts of f and g, respectively, that contain these two variables. Formally, let

\varphi(x^{(k)}_{i},x^{(k)}_{j})=a_{ij}x^{(k)}_{i}x^{(k)}_{j}+a^{(k)}_{i}x^{(k)}_{i}+a^{(k)}_{j}x^{(k)}_{j},
\psi(x^{(k)}_{i},x^{(k)}_{j})=q_{ij}x^{(k)}_{i}x^{(k)}_{j}+q^{(k)}_{i}x^{(k)}_{i}+q^{(k)}_{j}x^{(k)}_{j},

be functions of x^{(k)}_{i} and x^{(k)}_{j}, where a_{ij}\leq 0 and q_{ij},q^{(k)}_{i},q^{(k)}_{j}\geq 0. We use \varphi^{\prime}_{i},\varphi^{\prime}_{j},\psi^{\prime}_{i},\psi^{\prime}_{j} to denote the first-order partial derivatives of \varphi and \psi with respect to x_{i} and x_{j}. To simplify the notation, we drop the superscript “k” and use a_{i}^{\prime}, a_{j}^{\prime}, q_{i}^{\prime} and q_{j}^{\prime} to denote a_{i}^{(k)}, a_{j}^{(k)}, q_{i}^{(k)} and q_{j}^{(k)}, respectively. It suffices to prove that one can round the two variables x_{i},x_{j} such that at least one of them goes down to 0 or up to 1, without decreasing \varphi(x_{i},x_{j}) or increasing \psi(x_{i},x_{j}).

  • If \varphi(x_{i},x_{j})\leq 0 then rounding both variables x_{i},x_{j} down to 0 does not decrease \varphi(x_{i},x_{j}) or increase \psi(x_{i},x_{j}). Thus, we may assume that \varphi(x_{i},x_{j})>0.

  • If a_{i}^{\prime}\leq 0 then set \overline{x}_{i}=0, and similarly for a_{j}^{\prime}. Thus, we may assume that a_{i}^{\prime},a_{j}^{\prime}>0.

  • If \varphi^{\prime}_{i}:=a_{ij}x_{j}+a_{i}^{\prime}\leq 0 then set \overline{x}_{i}=0, and similarly for \varphi^{\prime}_{j}:=a_{ij}x_{i}+a_{j}^{\prime}. Thus, we may assume that \varphi^{\prime}_{i}>0 and \varphi^{\prime}_{j}>0.

  • If q_{ij}=q_{i}^{\prime}=0, we set \overline{x}_{i}\in\{0,1\} so as to maximize \varphi(x_{i},x_{j}) (which is linear in x_{i}). This guarantees that \varphi(\overline{x}_{i},x_{j})\geq\varphi(x_{i},x_{j}) while \psi(\overline{x}_{i},x_{j})=\psi(x_{i},x_{j}). Thus we may assume that q_{ij}+q_{i}^{\prime}>0 and, similarly, q_{ij}+q_{j}^{\prime}>0.

Note that \psi(x_{i},x_{j}),\psi^{\prime}_{i}(x_{i},x_{j}),\psi^{\prime}_{j}(x_{i},x_{j})\geq 0. The following lemma completes the proof of the correctness of the pipage rounding.

Lemma 3

Suppose that a_{i}^{\prime},a_{j}^{\prime}>0, and that \varphi(x_{i},x_{j}) together with its first derivatives \varphi^{\prime}_{i}(x_{i},x_{j}) and \varphi^{\prime}_{j}(x_{i},x_{j}) are all strictly positive. Then there always exist two parameters \alpha\in[-x_{i},1-x_{i}],\beta\in[-x_{j},1-x_{j}] satisfying simultaneously both inequalities \varphi(x_{i}+\alpha,x_{j}+\beta)\geq\varphi(x_{i},x_{j}) and \psi(x_{i}+\alpha,x_{j}+\beta)\leq\psi(x_{i},x_{j}), with at least one of \alpha,\beta held at the boundary of its domain.

Proof.    We have that

\left\{\begin{matrix}\varphi(x_{i}+\alpha,x_{j}+\beta)\geq\varphi(x_{i},x_{j}),\\ \psi(x_{i}+\alpha,x_{j}+\beta)\leq\psi(x_{i},x_{j}),\end{matrix}\right.\qquad(7)
\Longleftrightarrow\left\{\begin{matrix}a_{ij}\alpha\beta+\varphi^{\prime}_{i}\alpha+\varphi^{\prime}_{j}\beta\geq 0,\\ q_{ij}\alpha\beta+\psi^{\prime}_{i}\alpha+\psi^{\prime}_{j}\beta\leq 0,\end{matrix}\right.\qquad(8)

where \psi^{\prime}_{i}:=q_{ij}x_{j}+q_{i}^{\prime} and \psi^{\prime}_{j}:=q_{ij}x_{i}+q_{j}^{\prime}. We will fix one of the variables \alpha or \beta to one of its two boundary values, and assign the other variable a value in its range such that (7) is satisfied. To this end, we consider four possible cases below.

  • Case 1: \alpha=-x_{i}. Plugging this into (8) implies \frac{x_{i}\varphi^{\prime}_{i}}{a_{j}^{\prime}}\leq\beta\leq\frac{x_{i}\psi^{\prime}_{i}}{q_{j}^{\prime}}, and such a \beta\in[-x_{j},1-x_{j}] exists (even if q_{j}^{\prime}=0, reading the upper bound as +\infty in that case) if and only if the following system is feasible

    (I1)~\left\{\varphi(x_{i},x_{j})\leq a_{j}^{\prime}~\text{(I1.1)}~\bigwedge~\frac{\varphi^{\prime}_{i}}{\psi^{\prime}_{i}}\leq\frac{a_{j}^{\prime}}{q_{j}^{\prime}}~\text{(I1.2)}\right\}
  • Case 2: \beta=-x_{j}. Using a similar argument as in Case 1, there is a value of \alpha\in[-x_{i},1-x_{i}] satisfying (7) if and only if the following system is feasible

    (I2)~\left\{\varphi(x_{i},x_{j})\leq a_{i}^{\prime}~\text{(I2.1)}~\bigwedge~\frac{\varphi^{\prime}_{j}}{\psi^{\prime}_{j}}\leq\frac{a_{i}^{\prime}}{q_{i}^{\prime}}~\text{(I2.2)}\right\}
  • Case 3: \alpha=1-x_{i}. Plugging this into (7) implies

    \left\{\begin{matrix}(a_{ij}+a_{j}^{\prime})\beta+(1-x_{i})\varphi^{\prime}_{i}\geq 0,\\(q_{ij}+q_{j}^{\prime})\beta+(1-x_{i})\psi^{\prime}_{i}\leq 0.\end{matrix}\right.\qquad(9)

    Without loss of generality a_{ij}+a_{j}^{\prime}\neq 0 (otherwise, it becomes only easier to find a feasible value of \beta). Using similar calculations as in the previous cases, it can be shown (regardless of the sign of a_{ij}+a^{\prime}_{j}) that there is \beta\in[-x_{j},1-x_{j}] for which (9) is satisfied if and only if the following system is feasible

    (I3)~\left\{\psi(x_{i},x_{j})\geq q_{i}^{\prime}~\text{(I3.1)}~\bigwedge~\frac{\varphi^{\prime}_{i}}{\psi^{\prime}_{i}}\geq\frac{a_{j}^{\prime}+a_{ij}}{q_{j}^{\prime}+q_{ij}}~\text{(I3.2)}\right\}
  • Case 4: \beta=1-x_{j}. Similar to Case 3, there is a value of \alpha\in[-x_{i},1-x_{i}] satisfying (7) if and only if the following system is feasible

    (I4)~\left\{\psi(x_{i},x_{j})\geq q_{j}^{\prime}~\text{(I4.1)}~\bigwedge~\frac{\varphi^{\prime}_{j}}{\psi^{\prime}_{j}}\geq\frac{a_{i}^{\prime}+a_{ij}}{q_{i}^{\prime}+q_{ij}}~\text{(I4.2)}\right\}

We now prove, via contradiction, that at least one of the systems (I1), (I2), (I3), or (I4) is feasible. Before doing so, we observe some logical relations between some of the inequalities in these systems. In the following, for an inequality (I) \in \{(I1.1),\ldots,(I4.2)\}, we overload notation by writing (I) to mean that the inequality holds, and use (\overline{\text{I}}) to denote that the reverse (strict) inequality holds. The following observations are easy to verify:

  • L1:

    [(I1.2) \vee (I2.2)]: indeed, suppose that this is not the case, that is

    \left\{\begin{matrix}q_{j}^{\prime}\varphi^{\prime}_{i}>a_{j}^{\prime}\psi^{\prime}_{i}\\ q_{i}^{\prime}\varphi^{\prime}_{j}>a_{i}^{\prime}\psi^{\prime}_{j},\end{matrix}\right.\qquad(10)
    \Longleftrightarrow\left\{\begin{matrix}a_{ij}q_{j}^{\prime}x_{j}+a_{i}^{\prime}q_{j}^{\prime}>a_{j}^{\prime}q_{ij}x_{j}+a_{j}^{\prime}q_{i}^{\prime}\\ a_{ij}q_{i}^{\prime}x_{i}+a_{j}^{\prime}q_{i}^{\prime}>a_{i}^{\prime}q_{ij}x_{i}+a_{i}^{\prime}q_{j}^{\prime}.\end{matrix}\right.\qquad(11)

    Adding the two inequalities in (11), it follows that a_{ij}(q_{i}^{\prime}x_{i}+q_{j}^{\prime}x_{j})>q_{ij}(a_{i}^{\prime}x_{i}+a_{j}^{\prime}x_{j}), which does not hold as a_{ij}\leq 0 while the right-hand side is non-negative.

  • L2:

    Similarly, we have [(I3.2) \vee (I4.2)]: indeed, suppose that this is not the case. Then,

    \frac{\varphi^{\prime}_{i}}{\psi^{\prime}_{i}}<\frac{a_{j}^{\prime}+a_{ij}}{q_{j}^{\prime}+q_{ij}}
    \Longleftrightarrow a_{ij}q_{j}^{\prime}x_{j}+a_{i}^{\prime}q_{j}^{\prime}+a_{i}^{\prime}q_{ij}<a_{j}^{\prime}q_{ij}x_{j}+a_{j}^{\prime}q_{i}^{\prime}+a_{ij}q_{i}^{\prime}

    and

    \frac{\varphi^{\prime}_{j}}{\psi^{\prime}_{j}}<\frac{a_{i}^{\prime}+a_{ij}}{q_{i}^{\prime}+q_{ij}}
    \Longleftrightarrow a_{ij}q_{i}^{\prime}x_{i}+a_{j}^{\prime}q_{i}^{\prime}+a_{j}^{\prime}q_{ij}<a_{i}^{\prime}q_{ij}x_{i}+a_{i}^{\prime}q_{j}^{\prime}+a_{ij}q_{j}^{\prime}.

    Collecting terms, we get

    a_{ij}\left[q_{i}^{\prime}(1-x_{i})+q_{j}^{\prime}(1-x_{j})\right]>q_{ij}\left[a_{i}^{\prime}(1-x_{i})+a_{j}^{\prime}(1-x_{j})\right],

    which is a contradiction as a_{ij}\leq 0 and x_{i},x_{j}<1.

  • L3:

    [(\overline{\text{I1.1}}) \wedge (I1.2)] \Longrightarrow (I4.1): indeed, the antecedent of this implication imposes that

    a_{j}^{\prime}(1-x_{j})<a_{ij}x_{i}x_{j}+a_{i}^{\prime}x_{i}=x_{i}\varphi^{\prime}_{i}\leq x_{i}\cdot\frac{a_{j}^{\prime}\psi^{\prime}_{i}}{q_{j}^{\prime}},

    giving q_{j}^{\prime}<q_{j}^{\prime}x_{j}+x_{i}\psi^{\prime}_{i}=\psi(x_{i},x_{j}), which satisfies (I4.1).

  • L4:

    Similarly, [(I2.2) \wedge (\overline{\text{I3.1}})] \Longrightarrow (I2.1): indeed, the antecedent of the implication imposes that

    q_{i}^{\prime}(1-x_{i})>q_{ij}x_{i}x_{j}+q_{j}^{\prime}x_{j}=x_{j}\psi^{\prime}_{j}\geq x_{j}\cdot\frac{q_{i}^{\prime}\varphi^{\prime}_{j}}{a_{i}^{\prime}},

    implying that q_{i}^{\prime}>0, and consequently giving a_{i}^{\prime}>a_{i}^{\prime}x_{i}+x_{j}\varphi^{\prime}_{j}=\varphi(x_{i},x_{j}), which satisfies (I2.1).

  • L5:

    [(I2.2) \vee (I4.2)]: indeed, suppose that this is not the case. Then we get a contradiction:

    \frac{a_{i}^{\prime}}{q_{i}^{\prime}}\geq\frac{a_{i}^{\prime}+a_{ij}}{q_{i}^{\prime}+q_{ij}}>\frac{\varphi^{\prime}_{j}}{\psi^{\prime}_{j}}>\frac{a_{i}^{\prime}}{q_{i}^{\prime}},

    as a_{ij}\leq 0 and a_{i}^{\prime},q_{i}^{\prime},q_{ij}\geq 0.

Now, suppose that none of the systems (I1), (I2), (I3), and (I4) is feasible. W.l.o.g. assume that (I1.2) holds (the case when (I2.2) holds can be handled similarly by exchanging the roles of the indices i and j). Then we get the following sequence of implications:

(\text{I1.2})~\Longrightarrow~(\overline{\text{I1.1}})~\stackrel{L3}{\Longrightarrow}~(\text{I4.1})~\Longrightarrow~(\overline{\text{I4.2}})~\stackrel{L2}{\Longrightarrow}~(\text{I3.2})~\Longrightarrow~(\overline{\text{I3.1}})~\stackrel{L4}{\Longrightarrow}~(\overline{\text{I2.2}})~\stackrel{L5}{\Longrightarrow}~(\text{I4.2}),

leading to the contradiction that the system (I4) is feasible.   ❑ 

Theorem 3 paves the way for designing approximation algorithms for \mathsf{MaxSUB}, based on solving (approximately or exactly) its relaxation \mathcal{F}. We are not aware of any algorithm for solving such a relaxation exactly in polynomial time. However, for a general convex constraint (where Q is a general non-negative PSD matrix), one can obtain a constant-factor approximation by taking advantage of the result of (Bian et al. 2017a).

Theorem 4 ((Bian et al. 2017a))

There is an iterative greedy algorithm for maximizing a (continuous) DR-submodular function f subject to a down-closed convex set S, such that it proceeds in K steps (each taking polynomial time), and outputs a solution {\bf x} such that f({\bf x})\geq\frac{1}{e}f({\bf x}^{*})-\frac{LD^{2}}{2K}-O(\frac{1}{K^{2}})f({\bf x}^{*}), where {\bf x}^{*} is an optimal solution, L is a constant such that \|\nabla f({\bf x})-\nabla f({\bf y})\|_{2}\leq L\|{\bf x}-{\bf y}\|_{2}, and D=\max_{{\bf x},{\bf y}\in S}\|{\bf x}-{\bf y}\|_{2}.

Here, a convex set S is said to be down-closed if for any {\bf x},{\bf y}\in\mathbb{R}_{+}^{n}, {\bf x}\leq{\bf y} and {\bf y}\in S imply that {\bf x}\in S. Also, a twice-differentiable function f:\mathcal{X}\rightarrow\mathbb{R} is DR-submodular if and only if \nabla^{2}_{ij}f({\bf x})\leq 0 for all {\bf x}\in\mathcal{X} and i,j\in[n] (see (Bian et al. 2017b) for more details). When applied to our problem, the above result gives the following.

Corollary 1

There is a (\frac{1}{e}-O(\epsilon))-approximation algorithm for maximizing f({\bf x}) under the constraints {\bf x}^{\top}Q{\bf x}\leq C, {\bf q}^{\top}{\bf x}\leq C, {\bf x}\in[0,1]^{n}, when A^{*}\leq{\bf 0}, Q\geq{\bf 0} and Q is PSD.

Proof.    Clearly, the convex set defined by {𝐱[0,1]n:𝐱Q𝐱C,𝐪𝐱C}\{{\bf x}\in[0,1]^{n}:~{}{\bf x}^{\top}Q{\bf x}\leq C,{\bf q}^{\top}{\bf x}\leq C\} with non-negative PSD matrix QQ, and positive number CC, is down-closed. We now derive upper bounds on the values of the parameters LL and DD defined in Theorem 4. The gradient of ff can be computed as f(𝐱)=A𝐱+𝐚\nabla f({\bf x})=A^{*}{\bf x}+{\bf a}, and thus

$$\|\nabla f({\bf x}) - \nabla f({\bf y})\|_2 = \|A^*({\bf x} - {\bf y})\|_2 \leq \|A^*\|_F \cdot \|{\bf x} - {\bf y}\|_2 \leq n\lambda \cdot \|{\bf x} - {\bf y}\|_2,$$

where $\|A^*\|_F$ denotes the Frobenius norm of $A^*$, $\lambda$ is the maximum absolute value of the entries of $A^*$, and the first inequality follows from the Cauchy-Schwarz inequality. Since $A^*$ has $n^2$ entries, each of absolute value at most $\lambda$, we obtain $\|A^*\|_F \leq n\lambda$, and hence $L \leq n\lambda$. On the other hand, in our case of $\mathsf{MaxSUB}$, $D$ is upper bounded by the maximum length of any feasible point in the hypercube $[0,1]^n$, which equals $\sqrt{n}$. To establish the approximation factor, note that $f({\bf x}^*) \geq a_i$ for all $i \in [n]$, because every unit vector ${\bf x}^{\{i\}}$ is feasible (by the non-negativity assumption on $Q$). Also, by the non-negativity of the objective function, it holds that $f({\bf x}^{\{i,j\}}) = a_i + a_j + a_{ij} + a_{ji} \geq 0$ for every $(i,j) \in [n]^2$, which implies that $|a_{ij}|, |a_{ji}| \leq 2\max\{a_i, a_j\}$. Therefore, $f({\bf x}^*) \geq \frac{\lambda}{2}$. Plugging $K := \frac{n^2}{\epsilon}$ into Theorem 4 makes the additive error $\frac{LD^2}{2K} \leq \frac{n\lambda \cdot n \cdot \epsilon}{2n^2} = \frac{\epsilon\lambda}{2} \leq \epsilon f({\bf x}^*)$, while the $O(\frac{1}{K^2}) f({\bf x}^*)$ term becomes $O(\frac{\epsilon^2}{n^4}) f({\bf x}^*)$; this yields the desired $(\frac{1}{e} - O(\epsilon))$ approximation factor.   ❑
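To illustrate how such an iterative greedy scheme can be organized, the following is a minimal Python sketch of a Frank-Wolfe-style procedure for our constraint set. It is a schematic rendition, not a faithful reimplementation of the algorithm of (Bian et al. 2017a), whose step-size rule and oracle details differ; the function name fw_dr_submodular and the use of SciPy's SLSQP solver for the linear maximization subproblem are our own illustrative choices.

    import numpy as np
    from scipy.optimize import minimize

    def fw_dr_submodular(grad_f, n, Q, q, C, K):
        """Frank-Wolfe-style iterative greedy (schematic) for DR-submodular
        maximization over the down-closed set
            S = {x in [0,1]^n : x^T Q x <= C, q^T x <= C}.

        K uniform steps of size 1/K; each step moves toward a point of S
        maximizing the linear approximation <grad f(x), v>, capped
        coordinate-wise by 1 - x so iterates stay in [0, 1]^n.
        """
        x = np.zeros(n)
        for _ in range(K):
            g = grad_f(x)
            # Linear maximization oracle over the (capped) feasible set;
            # this subproblem is a convex program, solved here with SLSQP.
            res = minimize(
                lambda v, g=g: -(g @ v),          # maximize <g, v>
                x0=np.zeros(n),                   # v = 0 is always feasible
                jac=lambda v, g=g: -g,
                bounds=[(0.0, 1.0 - xi) for xi in x],
                constraints=[
                    {"type": "ineq", "fun": lambda v: C - v @ Q @ v},
                    {"type": "ineq", "fun": lambda v: C - q @ v},
                ],
                method="SLSQP",
            )
            x = x + res.x / K                     # uniform step size 1/K
        return x

As in the proof above, the gradient of the quadratic objective would be supplied as grad_f = lambda x: A_star @ x + a (with A_star and a the problem data), and K would be set to roughly $n^2/\epsilon$; since each iterate is a convex combination of points of the convex set $S$ and the origin, the output remains feasible.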

Now, by applying Corollary 1 and Theorem 3, one can obtain the following result for $\mathsf{MaxSUB}$. Note that the objective function $f({\bf x}) = {\bf x}^\top A {\bf x} + {\bf a}^\top {\bf x}$ can be written as ${\bf x}^\top A^* {\bf x} + {\bf a}^\top {\bf x}$ for binary variables, which exhibits DR-submodularity over $[0,1]^n$ as $A^* < {\bf 0}$ (a quick sanity check of this condition is sketched below).
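As a concrete illustration of this last point (a minimal sketch, purely for intuition; the function name is ours, not from the references): for a quadratic $f({\bf x}) = {\bf x}^\top A^* {\bf x} + {\bf a}^\top {\bf x}$, the Hessian is the constant matrix $A^* + (A^*)^\top$, so checking DR-submodularity over $[0,1]^n$ reduces to an entrywise sign check, which trivially passes when $A^* < {\bf 0}$.

    import numpy as np

    def is_dr_submodular_quadratic(A_star: np.ndarray) -> bool:
        """DR-submodularity test for f(x) = x^T A* x + a^T x over a box.

        The Hessian of f is the constant matrix A* + A*^T (the linear term
        a^T x does not contribute), so the defining condition
        grad^2_ij f(x) <= 0 for all x reduces to an entrywise check.
        """
        H = A_star + A_star.T
        return bool(np.all(H <= 0))

    # Any entrywise negative A*, as assumed above, passes the check.
    A_star = np.array([[-2.0, -0.5],
                       [-1.0, -3.0]])
    assert is_dr_submodular_quadratic(A_star)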

Theorem 5

There is a $(\frac{\phi^2}{2e} - O(\epsilon))$-approximation algorithm for $\mathsf{MaxSUB}$.

Conclusion

We have presented novel approximation algorithms for two power scheduling problems with sum-of-squares constraints, one of which significantly improves upon previously known results. Our work also raises several open problems for future research. Perhaps the most interesting question is whether our PTAS for $\mathsf{MinCKP}$ can be extended to the case where the constraint involves any constant number of squares. One can observe that our approach to solving the natural (non-convex) relaxation does not extend to this setting, and thus the development of new techniques may be required. For the submodular maximization problem $\mathsf{MaxSUB}$, it would be interesting to investigate the following two questions: i) Is there a better approximation algorithm for solving the relaxation? and ii) Is it possible to design a pipage rounding scheme that works for more general (differentiable) DR-submodular functions? For the first question, one idea is to exploit the special structure of the sum-of-squares constraint, to which semidefinite programming-based methods might be applicable. Finally, extending the results obtained in this paper to settings with more than one constraint is another promising direction for research.

References

  • Bansal, Kimbrel, and Pruhs (2004) Bansal, N.; Kimbrel, T.; and Pruhs, K. 2004. Dynamic Speed Scaling to Manage Energy and Temperature. In 45th Symposium on Foundations of Computer Science (FOCS 2004), 17-19 October 2004, Rome, Italy, Proceedings, 520–529. IEEE Computer Society.
  • Bian et al. (2017a) Bian, A.; Levy, K. Y.; Krause, A.; and Buhmann, J. M. 2017a. Non-monotone Continuous DR-submodular Maximization: Structure and Algorithms. In Guyon, I.; von Luxburg, U.; Bengio, S.; Wallach, H. M.; Fergus, R.; Vishwanathan, S. V. N.; and Garnett, R., eds., Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, Long Beach, CA, USA, 486–496.
  • Bian et al. (2017b) Bian, A. A.; Mirzasoleiman, B.; Buhmann, J. M.; and Krause, A. 2017b. Guaranteed Non-convex Optimization: Submodular Maximization over Continuous Domains. In Singh, A.; and Zhu, X. J., eds., Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, Fort Lauderdale, FL, USA, volume 54 of Proceedings of Machine Learning Research, 111–120. PMLR.
  • Chau, Elbassioni, and Khonji (2016) Chau, C.; Elbassioni, K. M.; and Khonji, M. 2016. Truthful Mechanisms for Combinatorial Allocation of Electric Power in Alternating Current Electric Systems for Smart Grid. ACM Trans. Economics and Comput., 5(1): 7:1–7:29.
  • Elbassioni, Karapetyan, and Nguyen (2019) Elbassioni, K. M.; Karapetyan, A.; and Nguyen, T. T. 2019. Approximation schemes for r-weighted Minimization Knapsack problems. Ann. Oper. Res., 279(1-2): 367–386.
  • Elbassioni and Nguyen (2017) Elbassioni, K. M.; and Nguyen, T. T. 2017. Approximation algorithms for binary packing problems with quadratic constraints of low cp-rank decompositions. Discret. Appl. Math., 230: 56–70.
  • Frieze and Clarke (1984) Frieze, A.; and Clarke, M. 1984. Approximation Algorithm for the m-Dimensional 0-1 Knapsack Problem. European Journal of Operational Research, 15: 100–109.
  • Grainger and Stevenson (1994) Grainger, J.; and Stevenson, W. 1994. Power System Analysis. McGraw-Hill.
  • Karapetyan et al. (2018) Karapetyan, A.; Khonji, M.; Chau, C.; Elbassioni, K. M.; and Zeineldin, H. H. 2018. Efficient Algorithm for Scalable Event-Based Demand Response Management in Microgrids. IEEE Trans. Smart Grid, 9(4): 2714–2725.
  • Khonji et al. (2019) Khonji, M.; Karapetyan, A.; Elbassioni, K. M.; and Chau, S. C. 2019. Complex-demand scheduling problem with application in smart grid. Theor. Comput. Sci., 761: 34–50.
  • Klimm et al. (2021) Klimm, M.; Pfetsch, M.; Raber, R.; and Skutella, M. 2021. Approximation of Binary Second Order Cone Programs of Packing Type.
  • Klimm et al. (2022) Klimm, M.; Pfetsch, M. E.; Raber, R.; and Skutella, M. 2022. Packing under convex quadratic constraints. Math. Program., 192(1): 361–386.
  • Weymouth (1912) Weymouth, T. 1912. Problems in natural gas engineering. Trans. Am. Soc. Mech. Eng., 34: 185–231.
  • Wierman, Andrew, and Tang (2012) Wierman, A.; Andrew, L. L. H.; and Tang, A. 2012. Power-Aware Speed Scaling in Processor Sharing Systems: Optimality and Robustness. Perform. Eval., 69(12): 601–622.
  • Woeginger (2000) Woeginger, G. J. 2000. When Does a Dynamic Programming Formulation Guarantee the Existence of a Fully Polynomial Time Approximation Scheme (FPTAS)? INFORMS J. Comput., 12(1): 57–74.
  • Wood and Wollenberg (2012) Wood, A. J.; and Wollenberg, B. F. 2012. Power generation, operation, and control. Wiley.
  • Yu and Chau (2013) Yu, L.; and Chau, C. 2013. Complex-demand knapsack problems and incentives in AC power systems. In Gini, M. L.; Shehory, O.; Ito, T.; and Jonker, C. M., eds., International conference on Autonomous Agents and Multi-Agent Systems, AAMAS ’13, Saint Paul, MN, USA, May 6-10, 2013, 973–980. IFAAMAS.