
1: Kyushu University. [email protected], [email protected], [email protected]
2: NEC Corporation / RIKEN (current affiliation: The University of Tokyo / RIKEN). [email protected]
3: NEC Corporation. [email protected]

Online \mathrm{L}^{\natural}-Convex Minimization

Ken Yokoyama 1, Shinji Ito 2, Tatsuya Matsuoka 3, Kei Kimura 1, Makoto Yokoo 1
Abstract

An online decision-making problem is a learning problem in which a player repeatedly makes decisions in order to minimize the long-term loss. Such problems arising in applications often have nonlinear combinatorial objective functions, and developing algorithms for them has attracted considerable attention. An existing general framework for dealing with such objective functions is online submodular minimization. However, practical problems often fall outside the scope of this framework, since the domain of a submodular function is limited to a subset of the unit hypercube. To overcome this limitation of the existing framework, we introduce online \mathrm{L}^{\natural}-convex minimization, where an \mathrm{L}^{\natural}-convex function generalizes a submodular function so that the domain is a subset of the integer lattice. We propose computationally efficient algorithms for online \mathrm{L}^{\natural}-convex function minimization in two major settings: the full information setting and the bandit setting. We analyze the regrets of these algorithms and show, in particular, that our algorithm for the full information setting attains a tight regret bound up to a constant factor. We also present several motivating examples that illustrate the usefulness of online \mathrm{L}^{\natural}-convex minimization.

Keywords:
Online optimization · \mathrm{L}^{\natural}-convex functions · Discrete convex analysis

1 Introduction

Online decision-making is a learning problem in which a player repeatedly chooses decisions and predicts losses in order to minimize the long-term loss. Such decision-making problems often have nonlinear and combinatorial (i.e., discrete) objective functions [23, 21, 19]. Designing computationally efficient algorithms with low regret for such problems is challenging.

Problems with nonlinear combinatorial objective functions appear, for example, in price prediction optimization. Price prediction optimization maximizes profit by predicting demand under an unknown demand distribution and determining the prices of multiple items. In practical applications, pricing is often discrete, such as offering discounts of 5% or 10%, and demand as a function of prices tends to be nonlinear.

An existing framework for dealing with such problems is online submodular minimization, where the objective function is submodular. Submodularity, also known as the law of diminishing marginal utility, appears frequently in various fields such as economics, machine learning, and operations research.

Although online submodular minimization has diverse applications, its scope is also limited since the domain of a submodular function is a distributive lattice, which can be identified with a subset of the unit hypercube \{0,1\}^{d}. In fact, in the above example, price prediction optimization cannot be cast as an online submodular minimization if there are more than three types of items.

To overcome this limitation, we introduce online \mathrm{L}^{\natural}-convex function minimization, where the objective function to be minimized is an \mathrm{L}^{\natural}-convex function. An \mathrm{L}^{\natural}-convex function is a generalization of a submodular function whose domain is a subset of the integer lattice (i.e., \mathbb{Z}^{d}) rather than \{0,1\}^{d}. Therefore, online \mathrm{L}^{\natural}-convex function minimization can capture problems that are out of the scope of online submodular minimization. Moreover, \mathrm{L}^{\natural}-convex functions are known to be transformed into multimodular functions via a unimodular transformation, and such multimodular functions appear in queueing theory, Markov decision processes, and discrete event systems [2, 1, 9]. Hence, online \mathrm{L}^{\natural}-convex function minimization also captures online multimodular function minimization.

1.1 Contributions

In this paper, we propose algorithms for online \mathrm{L}^{\natural}-convex function minimization in two settings commonly addressed in previous studies. The first one is the full information setting, where, after making a decision, the player has access to all information relevant to that decision. The second one is the bandit setting, in which the player receives feedback only on the results of selected actions and cannot know the results of unselected actions.

We evaluate our algorithms in terms of regret, which is a standard performance measure in online decision making. The regret R_{T} is the difference between the cumulative loss incurred over the T rounds and the cumulative loss of the fixed choice that is optimal in hindsight. See Equation (3) in Section 2 for a formal definition of the regret.

Notation needed for explaining the contributions in this paper is listed in Table 1. The contributions of our study can be summarized as follows.

Table 1: Notation
Parameter | Meaning
d\in\mathbb{Z}_{+} | Dimension of the decision space
[d] | \{1,2,\dots,d\}
T\in\mathbb{Z}_{+} | Time horizon
\check{\gamma},\hat{\gamma}:[d]\rightarrow\mathbb{Z} | Lower and upper bounds of the decision space
\mathcal{K}\subseteq\{z\in\mathbb{Z}^{d}\mid\check{\gamma}(i)\leq z_{i}\leq\hat{\gamma}(i)~(\forall i\in[d])\} | Bounded decision space that is an \mathrm{L}^{\natural}-convex set
f_{t}:\mathcal{K}\rightarrow[-M,M] | \mathrm{L}^{\natural}-convex cost function at time t\in[T]
N | \max_{i\in[d]}\{\hat{\gamma}(i)-\check{\gamma}(i)\}
\hat{L}\in\mathbb{R} | l_{\infty}-Lipschitz constant of f_{t}~(\forall t\in[T])
  • In the full information setting of online \mathrm{L}^{\natural}-convex function minimization, we propose a computationally efficient randomized algorithm that achieves the following regret bound:

    \mathbf{E}[R_{T}]=O(\hat{L}N\sqrt{dT}).

  • In the bandit setting of online \mathrm{L}^{\natural}-convex function minimization, we propose a computationally efficient randomized algorithm that achieves the following regret bound:

    \mathbf{E}[R_{T}]=O(MdNT^{2/3}).

  • For online \mathrm{L}^{\natural}-convex function minimization, we show that for any algorithm there is a sequence of \mathrm{L}^{\natural}-convex cost functions such that the algorithm incurs regret at least \Omega(\hat{L}N\sqrt{dT}). Therefore, our proposed algorithm for the full information setting achieves the best possible regret bound up to a constant factor.

  • We also present examples of problems that can be naturally formulated as online \mathrm{L}^{\natural}-convex minimization in Section 5.

Table 2 summarizes the regret bounds for online submodular minimization and online \mathrm{L}^{\natural}-convex minimization.

Table 2: Regret bounds in online submodular and \mathrm{L}^{\natural}-convex minimization
Model | Upper bound (full information) | Upper bound (bandit) | Lower bound
Online submodular minimization | O(\hat{L}\sqrt{dT}) | O(MdT^{2/3}) | \Omega(\hat{L}\sqrt{dT})
Online \mathrm{L}^{\natural}-convex minimization | O(\hat{L}N\sqrt{dT}) | O(MdNT^{2/3}) | \Omega(\hat{L}N\sqrt{dT})

1.2 Related Work

\mathrm{L}^{\natural}-convex functions are central objects in discrete convex analysis, which aims to establish a general framework for the minimization of discrete functions (i.e., functions defined on the integer lattice) by combining ideas from continuous and discrete optimization. As mentioned above, \mathrm{L}^{\natural}-convex functions generalize submodular functions and can formulate various problems in diverse fields such as operations research [7, 8], economics [18], and computer vision [22]. Combinations of \mathrm{L}^{\natural}-convex functions and machine learning appear in, e.g., [24, 20]. An efficient algorithm has been proposed for minimizing \mathrm{L}^{\natural}-convex functions [16]; however, devising an algorithm for online \mathrm{L}^{\natural}-convex minimization requires a careful combination of online optimization and discrete convex analysis and is thus a nontrivial task.

Compared with the number of online decision-making problems studied on continuous domains, the number of those on discrete domains is relatively small. A submodular function is a discrete function that appears in a variety of applications related to online optimization, such as price optimization, and thus online submodular optimization is a well-studied topic. Note that submodular functions are special cases of \mathrm{L}^{\natural}-convex functions with the domain restricted to \{0,1\}^{d}. For online submodular minimization, Hazan and Kale [12] obtained a tight regret bound for the full information setting, while Bubeck et al. [4] obtained one for the bandit setting for general online convex function minimization and thus for online submodular minimization (through a convex extension). Chen et al. [6] gave an algorithm for online continuous submodular maximization and demonstrated its performance in experiments.

We should note that our techniques for online \mathrm{L}^{\natural}-convex minimization resemble those for stochastic \mathrm{L}^{\natural}-convex minimization [24]; however, the problem settings differ in that they aim to obtain PAC guarantees in stochastic models.

2 Preliminaries and Problem Statement

In this section, we present fundamental properties of \mathrm{L}^{\natural}-convex functions and the concepts of online convex optimization, before detailing the specific problem setting.

We suppose that the decision space of the player is an \mathrm{L}^{\natural}-convex set.

Definition 1 (\mathrm{L}^{\natural}-convex set [11])

A set \mathcal{K}\subseteq\mathbb{Z}^{d} is an \mathrm{L}^{\natural}-convex set if it satisfies

p,q\in\mathcal{K}\Longrightarrow\left\lceil\frac{p+q}{2}\right\rceil,\left\lfloor\frac{p+q}{2}\right\rfloor\in\mathcal{K}. (1)

Hereafter, \mathcal{K} denotes an \mathrm{L}^{\natural}-convex set. We assume throughout the paper that \mathcal{K} is bounded. We also assume, without loss of generality, that \mathcal{K} is full-dimensional. We further assume that cost functions are \mathrm{L}^{\natural}-convex functions, defined as follows:

Definition 2 (\mathrm{L}^{\natural}-convex function [11])

A function f:\mathcal{K}\rightarrow\mathbb{R} is called an \mathrm{L}^{\natural}-convex function if it satisfies discrete midpoint convexity:

f(x)+f(y)\geq f\left(\left\lceil\frac{x+y}{2}\right\rceil\right)+f\left(\left\lfloor\frac{x+y}{2}\right\rfloor\right),\quad\forall x,y\in\mathcal{K}, (2)

where \lceil\cdot\rceil and \lfloor\cdot\rfloor are the ceiling and floor functions applied componentwise to vectors.
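As a quick illustration of Definition 2, the following minimal Python sketch brute-forces the discrete midpoint inequality on a small box domain; the function name and the test functions are our own illustrative assumptions, not part of the paper.

import itertools
import numpy as np

def is_lnat_convex(f, N, d):
    """Brute-force check of discrete midpoint convexity (2) on {0,...,N}^d."""
    pts = [np.array(p) for p in itertools.product(range(N + 1), repeat=d)]
    for x in pts:
        for y in pts:
            up = np.ceil((x + y) / 2).astype(int)
            lo = np.floor((x + y) / 2).astype(int)
            if f(x) + f(y) < f(up) + f(lo) - 1e-9:
                return False
    return True

print(is_lnat_convex(lambda z: z.max() + z.sum(), N=3, d=2))   # True: max plus linear
print(is_lnat_convex(lambda z: -(z.sum()) ** 2, N=3, d=2))     # False: violates (2)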

In our online decision-making problem, the goal of the player is to minimize the cumulative cost \sum_{t=1}^{T}f_{t}(z_{t}). The performance of the player is evaluated by means of the regret R_{T}, defined as

R_{T}=\sum_{t=1}^{T}f_{t}(z_{t})-\min_{z^{*}\in\mathcal{K}}\sum_{t=1}^{T}f_{t}(z^{*}). (3)

2.1 Online \mathrm{L}^{\natural}-Convex Minimization

In online \mathrm{L}^{\natural}-convex minimization, over iterations t=1,2,\dots,T, the online decision maker repeatedly chooses a point z_{t}\in\mathcal{K}. After selecting z_{t} in each iteration, feedback is provided in the form of the cost f_{t}(z_{t}), where f_{t}:\mathcal{K}\rightarrow\mathbb{R} is an \mathrm{L}^{\natural}-convex function.

In this paper, we consider two distinct problem settings characterized by different levels of information feedback:

  • In the full information setting, at each round t, the player has full access to the sequence of past functions f_{1},f_{2},\dots,f_{t-1}, which can be evaluated at any input.

  • In the bandit setting, at each round t, the player observes only the values f_{1}(z_{1}),f_{2}(z_{2}),\dots,f_{t-1}(z_{t-1}) of the functions at the player's previous selections z_{j}~(j=1,2,\dots,t-1).

2.2 Lovász Extension of Submodular Functions

We introduce the Lovász extension of submodular functions, a key algorithmic tool. We leverage the fact that the restriction of an \mathrm{L}^{\natural}-convex function to an arbitrary unit hypercube z+\{0,1\}^{d}~(z\in\mathbb{Z}^{d}) is a submodular function [16]. Further, for any \mathrm{L}^{\natural}-convex set \mathcal{K} and z\in\mathcal{K}, \mathcal{K}\cap(z+\{0,1\}^{d}) is a distributive lattice, on which a submodular function can be defined. To define a convex extension of an \mathrm{L}^{\natural}-convex set and function, we define the Lovász extension of a submodular function on a distributive lattice. Without loss of generality, we assume that all distributive lattices appearing in this paper are simple.

Definition 3 (simple distributive lattice)

D\subseteq\{0,1\}^{d} is said to be a simple distributive lattice if it satisfies the following condition:

\text{there exists a poset}~\mathcal{P}=([d],\preceq)~\text{such that}~D=\{I\subseteq[d]\mid I\text{ is a lower ideal of }\mathcal{P}\}. (4)

This definition follows from Theorem 3.9 in [10]. Let \overline{D} denote the convex hull of D. Before we proceed with the formal definition of the Lovász extension, we introduce the concept of a chain.

Definition 4 (chain)

A chain is a collection of subsets A_{0},A_{1},\dots,A_{p} of [d] such that

A_{0}\subsetneq A_{1}\subsetneq A_{2}\subsetneq\dots\subsetneq A_{p}.

For any x\in\overline{D}, there exists a unique chain that represents x as a convex combination:

x=\sum_{i=0}^{p}\mu_{i}\chi_{A_{i}} (5)

where \mu_{i}>0, \sum_{i=0}^{p}\mu_{i}=1, and \chi_{A_{i}}\in D is the characteristic vector of A_{i}\subseteq[d]~(i=0,\dots,p) [10].

It is known that any maximal chain whose characteristic vectors all lie in D has length d+1 when D is a simple distributive lattice. Moreover, such maximal chains are in one-to-one correspondence with the total orders obtained by topologically sorting the elements of the poset \mathcal{P}=([d],\preceq) representing D as in Definition 3.

We define the Lovász extension on a simple distributive lattice.

Definition 5 (Lovász extension on a simple distributive lattice [10])

For a submodular function f on D, its Lovász extension \hat{f} on \overline{D} is defined as follows. Let x\in\overline{D}, and let A_{0}\subsetneq A_{1}\subsetneq A_{2}\subsetneq\dots\subsetneq A_{p} be the unique chain such that x=\sum_{i=0}^{p}\mu_{i}\chi_{A_{i}},\ \mu_{i}>0,\ \sum_{i=0}^{p}\mu_{i}=1. Then the value of the Lovász extension \hat{f} at x is defined to be

\hat{f}(x):=\sum_{i=0}^{p}\mu_{i}f(\chi_{A_{i}}). (6)

We say a maximal chain is associated with x if (i) all of its elements are contained in D and (ii) it contains the unique chain of Definition 5 representing x as a subchain. We highlight the properties of the Lovász extension and submodular functions that are particularly important, as outlined below.

Lemma 1 (Properties of the Lovász extension [3, 10])

Let f:D\rightarrow\mathbb{R} be a submodular function with D a simple distributive lattice, and let \hat{f}:\overline{D}\rightarrow\mathbb{R} be the Lovász extension of f. The following properties hold for \hat{f}:

  • \hat{f} is a convex function.

  • For any x\in D, it holds that \hat{f}(x)=f(x).

  • For x\in\overline{D}, choosing a threshold \tau\in[0,1] uniformly at random and defining the level set S_{\tau}=\{i:x_{i}>\tau\}, we obtain \hat{f}(x)=\mathbf{E}_{\tau}[f(\chi_{S_{\tau}})].

  • For x\in\overline{D} and an arbitrary maximal chain \emptyset=A_{0}\subsetneq A_{1}\subsetneq A_{2}\subsetneq\dots\subsetneq A_{d}=[d] associated with x, a subgradient g of \hat{f} at x is given by:

    g(A_{i}-A_{i-1})=f(A_{i})-f(A_{i-1}). (7)
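To make Lemma 1 concrete, the following minimal Python sketch computes the Lovász extension value and the subgradient of Equation (7) in the standard case D=\{0,1\}^{d} (the full Boolean lattice), where sorting the coordinates in decreasing order yields a maximal chain associated with x. The helper name and the example set function are illustrative assumptions, not part of the paper.

import numpy as np

def lovasz_value_and_subgradient(f, x):
    """Lovász extension value and a subgradient at x in [0,1]^d,
    for a submodular f evaluated on indicator vectors in {0,1}^d."""
    d = len(x)
    order = np.argsort(-np.asarray(x))           # permutation pi (decreasing coordinates)
    chain = [np.zeros(d)]
    for i in order:                               # A_k = {pi(1),...,pi(k)}
        nxt = chain[-1].copy()
        nxt[i] = 1.0
        chain.append(nxt)
    vals = np.array([f(a) for a in chain])
    g = np.zeros(d)
    g[order] = vals[1:] - vals[:-1]               # g_{pi(k)} = f(A_k) - f(A_{k-1}), Equation (7)
    xs = np.asarray(x, dtype=float)
    value = vals[0] + float(np.dot(xs[order], vals[1:] - vals[:-1]))
    return value, g

# Example (illustrative): the submodular function f(S) = |S| (d - |S|) on {0,1}^3.
f = lambda a: a.sum() * (len(a) - a.sum())
val, g = lovasz_value_and_subgradient(f, [0.2, 0.7, 0.5])
print(val, g)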

2.3 Convex Extension of \mathrm{L}^{\natural}-Convex Functions

We introduce a convex extension of an \mathrm{L}^{\natural}-convex function by piecing together the Lovász extensions of the submodular functions introduced in the previous subsection. As a preparation, we introduce a maximal chain associated with x\in\overline{\mathcal{K}}.

Definition 6 (maximal chain associated with x)

Let x\in\overline{\mathcal{K}}. We define \emptyset=A_{0}\subsetneq A_{1}\subsetneq\dots\subsetneq A_{d}=[d] to be a maximal chain associated with x if it satisfies:

\exists\underline{z}\in\mathcal{K},\ \underline{z}\leq x\leq\underline{z}+\boldsymbol{1},\ \underline{z}+\chi_{A_{i}}\in\mathcal{K}~(\forall i\in\{0,1,\dots,d\}),
\exists\mu_{i}\geq 0~(\forall i\in\{0,1,\dots,d\}),\ x=\underline{z}+\sum_{i=0}^{d}\mu_{i}\chi_{A_{i}},\ \sum_{i=0}^{d}\mu_{i}=1.

We now define a convex extension of \mathrm{L}^{\natural}-convex functions.

Definition 7 (convex extension of \mathrm{L}^{\natural}-convex functions [10])

Let f:\mathcal{K}\rightarrow\mathbb{R} be an \mathrm{L}^{\natural}-convex function, and let x\in\overline{\mathcal{K}}. We define a convex extension \hat{f}:\overline{\mathcal{K}}\rightarrow\mathbb{R} of f as follows. Let \emptyset=A_{0}\subsetneq A_{1}\subsetneq\dots\subsetneq A_{d}=[d] and \underline{z}\in\mathcal{K} be a maximal chain associated with x, and let \underline{z} and \mu_{i} be those appearing in Definition 6. Then the value of \hat{f} at x is defined to be

\hat{f}(x):=\sum_{i=0}^{d}\mu_{i}f(\underline{z}+\chi_{A_{i}}). (8)

This convex extension \hat{f} is piecewise linear by definition, and a subgradient of \hat{f} at x\in\overline{\mathcal{K}} can be easily computed from the chain.

Next, we introduce an important property of \hat{f} defined by Equation (8).

Lemma 2 ([10])

A function f:\mathcal{K}\rightarrow\mathbb{R} is an \mathrm{L}^{\natural}-convex function if and only if its convex extension \hat{f} defined in Definition 7 is a convex function.

In addition, as in the case of submodular functions, the following is known.

Lemma 3

Let f:\mathcal{K}\rightarrow\mathbb{R} be an \mathrm{L}^{\natural}-convex function and \hat{f} be the convex extension of f defined in Definition 7. Let x\in\overline{\mathcal{K}}, let \emptyset=A_{0}\subsetneq A_{1}\subsetneq\dots\subsetneq A_{d}=[d] and \underline{z}\in\mathcal{K} be a maximal chain associated with x, and let \underline{z} and \mu_{i} be those appearing in Definition 6. Choosing a threshold \tau\in[0,1] uniformly at random and defining the level set S_{\tau}=\{i:x_{i}>\underline{z}_{i}+\tau\}, we obtain \hat{f}(x)=\mathbf{E}_{\tau}[f(\underline{z}+\chi_{S_{\tau}})].

Proof.

First, the restriction of an \mathrm{L}^{\natural}-convex function to each hypercube \mathcal{K}\cap(z+\{0,1\}^{d})~(z\in\mathbb{Z}^{d}) is a submodular function [16]. Next, by Lemma 1, the Lovász extension of a submodular function coincides with the expected value under the uniform threshold distribution. Therefore, it holds that \hat{f}(x)=\mathbf{E}_{\tau}[f(\underline{z}+\chi_{S_{\tau}})]. ∎

2.4 Subgradient of the Convex Extension of an \mathrm{L}^{\natural}-Convex Function

We have defined the convex extension \hat{f}:\overline{\mathcal{K}}\rightarrow\mathbb{R} of an \mathrm{L}^{\natural}-convex function f:\mathcal{K}\rightarrow\mathbb{R} in the previous subsection. Our algorithms in Section 3 rely on the fact that a subgradient g of \hat{f} at a given point x\in\overline{\mathcal{K}} can be computed efficiently. Here, we explain how this can be done.

Recall that the convex extension \hat{f} of an \mathrm{L}^{\natural}-convex function f is constructed by piecing together the Lovász extensions of the submodular functions obtained by restricting f to unit hypercubes z+\{0,1\}^{d}~(z\in\mathbb{Z}^{d}). Since we can compute subgradients for submodular functions, it suffices to show that, given a point x\in\overline{\mathcal{K}}, we can compute a maximal chain associated with x (see Definition 6). This is easy when x_{i}\notin\mathbb{Z} for every i, since we can set \underline{z}=\lfloor x\rfloor and express x as a convex combination of the characteristic vectors corresponding to a maximal chain of \underline{z}+\{0,1\}^{d}. Hence, the challenge is to compute such a maximal chain when x_{i}\in\mathbb{Z} for some i\in[d] (consider the case \mathcal{K}=\{y\in\mathbb{Z}^{2}\mid y_{1},y_{2}\leq 2\} and x=(2,2), where we cannot set \underline{z}=\lfloor x\rfloor); we show how to do this in the following.

We use the fact that the domain \mathcal{K} is an \mathrm{L}^{\natural}-convex set that can be expressed by a system of linear inequalities using \check{\gamma},\hat{\gamma}:[d]\to\mathbb{Z} and \gamma:[d]^{2}\to\mathbb{Z} as follows (see, e.g., [17]):

\mathcal{K}=P(\gamma,\check{\gamma},\hat{\gamma})\cap\mathbb{Z}^{d}, (9)

where

P(\gamma,\check{\gamma},\hat{\gamma}):=\{x\in\mathbb{R}^{d}\mid\check{\gamma}_{i}\leq x_{i}\leq\hat{\gamma}_{i}~(\forall i\in[d]),\ x_{i}-x_{j}\leq\gamma_{i,j}~(\forall i,j\in[d])\}. (10)

It is also known that \overline{\mathcal{K}}=P(\gamma,\check{\gamma},\hat{\gamma}). We assume that the expression (9) is explicitly given.

As in the case of submodular functions, we assume without loss of generality that \overline{\mathcal{K}} is full-dimensional (this corresponds to the domain of a submodular function being a simple distributive lattice).

As noted above, we would like to compute a maximal chain associated with a given x\in\overline{\mathcal{K}}. As shown in Subsection 2.3, a maximal chain can be found when (\underline{z}+[0,1]^{d})\cap\overline{\mathcal{K}} is full-dimensional. Hence, the above computation reduces to the following problem: given x\in\overline{\mathcal{K}}, find \underline{z}\in\mathbb{Z}^{d} such that (\underline{z}+[0,1]^{d})\cap\overline{\mathcal{K}} is full-dimensional, i.e., corresponds to a simple distributive lattice. This problem can be solved by the following procedure.

Algorithm 1 Compute a maximal chain in an \mathrm{L}^{\natural}-convex set
Input: \mathrm{L}^{\natural}-convex set \mathcal{K} and x\in\overline{\mathcal{K}}
D_{0}\leftarrow\overline{\mathcal{K}}.
for i=1,2,\dots,d do
     if x_{i}\notin\mathbb{Z} then
          \underline{z}_{i}\leftarrow\lfloor x_{i}\rfloor.
     else
          \delta_{i}\leftarrow\max_{y\in D_{i-1}}y_{i}.
          if \delta_{i}=x_{i} then
               \underline{z}_{i}\leftarrow x_{i}-1.
          else
               \underline{z}_{i}\leftarrow x_{i}.
          end if
     end if
     D_{i}\leftarrow D_{i-1}\cap\{y\in\mathbb{R}^{d}\mid\underline{z}_{i}\leq y_{i}\leq\underline{z}_{i}+1\}.
end for
f_{\underline{z}}\leftarrow f|_{\mathcal{K}\cap(\underline{z}+[0,1]^{d})}
/* Compute the poset representation ([d],R) of {\rm dom}\,(f_{\underline{z}}) as follows. */
R\leftarrow\emptyset (as a binary relation).
for each i,j\in[d] do
     \delta_{i,j}:=\max_{y\in\overline{\mathcal{K}}}y_{i}-y_{j}.
     if \underline{z}_{i}-\underline{z}_{j}=\delta_{i,j} then
          R\leftarrow R\cup\{(i,j)\}.
     end if
end for
Topologically sort ([d],R) and compute a permutation \pi:[d]\to[d] corresponding to this sorting.
return \underline{z} and a maximal chain \emptyset=A_{0}\subsetneq A_{1}\subsetneq\dots\subsetneq A_{d}=[d], where A_{i}\leftarrow\{\pi(1),\pi(2),\dots,\pi(i)\}~\text{for}~i=1,2,\dots,d.

Now we show the correctness of the above procedure. We first claim that each D_{i} (i=0,1,2,\dots,d) is full-dimensional. Since D_{d}=(\underline{z}+[0,1]^{d})\cap\overline{\mathcal{K}}, this implies that (\underline{z}+[0,1]^{d})\cap\overline{\mathcal{K}} is full-dimensional, as desired. To show this claim, we need the following auxiliary lemma. For c\in\mathbb{R}^{d} and r>0, let B(c,r)\subseteq\mathbb{R}^{d} be the open ball with center c and radius r, i.e., B(c,r)=\{y\in\mathbb{R}^{d}\mid\|y-c\|_{2}<r\}. Note that a subset of \mathbb{R}^{d} is full-dimensional if and only if it includes an open ball (which in turn is equivalent to it having positive volume).

Lemma 4

Let P\subseteq\mathbb{R}^{d} be a full-dimensional convex set. Then for any x\in P and any \varepsilon>0, {\rm vol}(P\cap B(x,\varepsilon))>0.

Proof.

Since P is full-dimensional, there exist c\in P and \varepsilon^{\prime}>0 such that B(c,\varepsilon^{\prime})\subseteq P. For any \lambda\in(0,1), we have B_{\lambda}:=(1-\lambda)x+\lambda B(c,\varepsilon^{\prime})\subseteq P, since P is convex. Moreover, for sufficiently small \lambda>0, we have B_{\lambda}\subseteq B(x,\varepsilon). (Concretely, \lambda=\varepsilon/(\|c-x\|_{2}+\varepsilon^{\prime}) implies B_{\lambda}\subseteq B(x,\varepsilon).) Hence, B_{\lambda}\subseteq P\cap B(x,\varepsilon). As B_{\lambda} has positive volume, so does P\cap B(x,\varepsilon). ∎

We are ready to prove that D_{i} is full-dimensional for each i=0,1,2,\dots,d.

Lemma 5

D_{i} is full-dimensional for each i=0,1,2,\dots,d.

Proof.

We show this by induction on i. Since D_{0}=\overline{\mathcal{K}}, the statement holds for i=0 by our assumption on {\rm dom}\,(f). Let i>0 and assume that D_{i-1} is full-dimensional. We show that D_{i} is also full-dimensional.

When x_{i}\notin\mathbb{Z}, the ball B(x,\varepsilon) is contained in \{y\in\mathbb{R}^{d}\mid\underline{z}_{i}\leq y_{i}\leq\underline{z}_{i}+1\} if \varepsilon=\min(x_{i}-\lfloor x_{i}\rfloor,\lceil x_{i}\rceil-x_{i}). Hence, D_{i}\cap B(x,\varepsilon)=D_{i-1}\cap B(x,\varepsilon). By Lemma 4, D_{i-1}\cap B(x,\varepsilon) has positive volume, hence so does D_{i}\cap B(x,\varepsilon), and thus D_{i} is full-dimensional.

When x_{i}\in\mathbb{Z}, there exist x^{\prime},x^{\prime\prime}\in D_{i-1} such that x^{\prime}_{i}=\underline{z}_{i} and x^{\prime\prime}_{i}=\underline{z}_{i}+1 by the full-dimensionality of D_{i-1} and the definition of \underline{z}_{i}. As D_{i-1} is convex, we have y:=(x^{\prime}+x^{\prime\prime})/2\in D_{i-1}. Since y_{i}=\underline{z}_{i}+1/2, we have D_{i}\cap B(y,1/2)=D_{i-1}\cap B(y,1/2). By Lemma 4, D_{i-1}\cap B(y,1/2) has positive volume, hence so does D_{i}\cap B(y,1/2), and thus D_{i} is full-dimensional. ∎

Since (\underline{z}+[0,1]^{d})\cap\overline{\mathcal{K}} is full-dimensional as shown above, the function f_{\underline{z}}:=f|_{\mathcal{K}\cap(\underline{z}+[0,1]^{d})} is a submodular function over a simple distributive lattice. Moreover, we can compute the poset representation of {\rm dom}\,(f_{\underline{z}})~(=\mathcal{K}\cap(\underline{z}+[0,1]^{d})) as follows: i\preceq j if \underline{z}_{i}-\underline{z}_{j}=\delta_{i,j}:=\max_{y\in\overline{\mathcal{K}}}y_{i}-y_{j}. By topologically sorting this poset, we can compute a maximal chain in (\underline{z}+\{0,1\}^{d})\cap\mathcal{K} and thus compute a subgradient of \hat{f} at x by (7) in Lemma 1. We summarize this as the following proposition.

Proposition 1

Let f:\mathcal{K}\rightarrow\mathbb{R} be an \mathrm{L}^{\natural}-convex function. Suppose that \mathcal{K} is full-dimensional and that the expression (9) of \mathcal{K} is given. Then there exists a polynomial-time algorithm that, given a point x\in\overline{\mathcal{K}}, computes a subgradient of the convex extension \hat{f} of f at x.
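As an illustration of Algorithm 1 and Proposition 1, the following Python sketch computes \underline{z} and an associated maximal chain in the special case where \mathcal{K} is a box (no pairwise constraints x_{i}-x_{j}\leq\gamma_{i,j}), so the poset is an antichain and the linear programs in Algorithm 1 reduce to reading off the bounds; the general case additionally requires those LP maximizations. The helper name is an illustrative assumption.

import numpy as np

def maximal_chain_box(x, lo, hi):
    """Sketch of Algorithm 1 when K = {z : lo <= z <= hi} is a box.
    Returns z_bar and a permutation pi with A_k = {pi(1),...,pi(k)}."""
    x, lo, hi = map(np.asarray, (x, lo, hi))
    z_bar = np.floor(x).astype(int)
    on_grid = np.isclose(x, np.round(x))
    xi = np.round(x[on_grid]).astype(int)
    # Integer coordinates sitting at the upper bound: shift the unit cell down by one.
    z_bar[on_grid] = np.where(xi >= hi[on_grid], xi - 1, xi)
    pi = np.argsort(-(x - z_bar))     # decreasing fractional parts give the chain
    return z_bar, pi

# Example: one coordinate at its upper bound and one interior integer coordinate.
z_bar, pi = maximal_chain_box(x=[2.0, 3.4, 1.0], lo=[1, 1, 1], hi=[2, 5, 5])
# z_bar = [1, 3, 1]; coordinates enter the chain in the order given by pi.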

3 Upper Bound on Regret

3.1 Full Information Setting

In this section, we extend the online submodular minimization algorithm (Algorithm 2 of Hazan and Kale [12]) to \mathrm{L}^{\natural}-convex functions in the full information setting. Subsequently, we derive an upper bound on the regret of the algorithm.

Let \mathcal{K} be an arbitrary \mathrm{L}^{\natural}-convex set, and let \overline{\mathcal{K}} be the convex hull of \mathcal{K}. As a preparation, define the projection \Pi_{\overline{\mathcal{K}}}:\mathbb{R}^{d}\rightarrow\overline{\mathcal{K}} onto the convex set \overline{\mathcal{K}} as follows:

\Pi_{\overline{\mathcal{K}}}(y):=\underset{x\in\overline{\mathcal{K}}}{\text{argmin}}\,\|x-y\|. (11)

When \mathcal{K}=[N]^{d}, \Pi_{\overline{\mathcal{K}}}(y) can be easily calculated as follows:

\Pi_{\overline{\mathcal{K}}}(y)(i)=\begin{cases}N&\text{if }y(i)>N\\ 1&\text{if }y(i)<1\\ y(i)&\text{otherwise}\end{cases}\qquad(i=1,\dots,d). (12)

We assume that this projection operation can be done efficiently, since it is a minimization of a convex function.

Next, define the rounding function \mathcal{P}_{\tau}:\mathbb{R}^{d}\rightarrow\mathbb{Z}^{d} using a threshold \tau\in[0,1] as follows:

\mathcal{P}_{\tau}(x)(i):=\begin{cases}\lceil x(i)\rceil&\text{if }x(i)>\lfloor x(i)\rfloor+\tau\\ \lfloor x(i)\rfloor&\text{if }x(i)\leq\lfloor x(i)\rfloor+\tau\end{cases}\qquad(i=1,\dots,d). (13)

With these preparatory definitions in place, we now detail the operational flow of Algorithm 2 (\mathrm{L}^{\natural}-convex subgradient descent).

At each iteration t, a threshold \tau is chosen uniformly at random from the interval [0,1]. This threshold is used to round the continuous decision variable x_{t}\in\overline{\mathcal{K}} to a discrete point z_{t}\in\mathcal{K}. Subsequently, the cost associated with z_{t} is observed, and x_{t} is updated using the computed subgradient. The algorithm repeats this process T times and is summarized as Algorithm 2 below.

Algorithm 2 \mathrm{L}^{\natural}-Convex Subgradient Descent
1: Parameter: \eta>0. Let x_{1}\in\overline{\mathcal{K}} be an arbitrary initial point.
2: for t=1~\text{to}~T do
3:     Choose a threshold \tau\in[0,1] uniformly at random.
4:     Set z_{t}=\mathcal{P}_{\tau}(x_{t}).
5:     Obtain cost f_{t}(z_{t}).
6:     Find a maximal chain associated with x_{t}, \emptyset=A_{0}\subsetneq A_{1}\subsetneq\dots\subsetneq A_{d}=[d], and \underline{z} by Algorithm 1.
7:     Compute a subgradient g_{t} of the convex extension \hat{f}_{t} at x_{t} using f_{t}(\underline{z}+\chi_{A_{0}}),f_{t}(\underline{z}+\chi_{A_{1}}),\dots,f_{t}(\underline{z}+\chi_{A_{d}}).
8:     Update: set x_{t+1}=\Pi_{\overline{\mathcal{K}}}(x_{t}-\eta g_{t}).
9: end for

Here, at each round t, the most computationally expensive part is the coordinate-wise sorting required to evaluate the Lovász extension, with computational complexity O(d\log d). Therefore, the overall computational complexity is O(Td\log d), which is polynomial.
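A minimal Python sketch of Algorithm 2 for the box domain \mathcal{K}=[N]^{d} is given below; it combines the rounding (13), the chain-based subgradient of Lemma 1, and the clipping projection (12). The interface costs(t, z) is a hypothetical stand-in for full-information access to f_{t}, and all names are illustrative.

import numpy as np

def lnat_subgradient_descent(costs, d, N, T, eta, seed=0):
    """Sketch of Algorithm 2 on the box K = {1,...,N}^d.
    costs(t, z) is a hypothetical full-information oracle returning f_t(z)."""
    rng = np.random.default_rng(seed)
    x = np.ones(d)                                    # x_1: arbitrary initial point
    total_cost = 0.0
    for t in range(T):
        tau = rng.uniform()
        z = np.floor(x) + (x > np.floor(x) + tau)     # rounding P_tau of (13)
        total_cost += costs(t, z.astype(int))
        # Maximal chain for the box: z_bar plus coordinates sorted by fractional part.
        z_bar = np.minimum(np.floor(x), N - 1).astype(int)
        pi = np.argsort(-(x - z_bar))
        a = np.zeros(d, dtype=int)
        chain_vals = [costs(t, z_bar.copy())]         # f_t(z_bar + chi_{A_0})
        for i in pi:
            a[i] = 1
            chain_vals.append(costs(t, z_bar + a))    # f_t(z_bar + chi_{A_k})
        g = np.zeros(d)
        g[pi] = np.diff(chain_vals)                   # subgradient via Equation (7)
        x = np.clip(x - eta * g, 1, N)                # projection (12) onto the box
    return total_cost

# Example with a made-up L-natural-convex cost; the step size is chosen arbitrarily.
print(lnat_subgradient_descent(lambda t, z: z.max() + 0.1 * z.sum(), d=3, N=5, T=100, eta=0.2))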

Next, in preparation for proving the regret upper bound of Algorithm 2, we introduce Lemma 6, which states that the norm of the subgradient is bounded above by the Lipschitz constant.

Lemma 6 (Example 3 in the supplementary material of [24])

Let f be an \mathrm{L}^{\natural}-convex function with l_{\infty}-Lipschitz constant \hat{L}, and let \hat{f}:\overline{\mathcal{K}}\rightarrow\mathbb{R} be the convex extension of f. For any x\in\overline{\mathcal{K}}, the subgradient g of \hat{f} at x computed using Algorithm 1 satisfies \|g\|_{1}\leq\frac{3}{2}\hat{L}.

For the regret analysis, we extend Lemma 11 in Hazan and Kale [12] to \mathcal{K}.

Lemma 7

Let f_{t}:\mathcal{K}\rightarrow\mathbb{R}~(t=1,2,\dots,T) be a sequence of \mathrm{L}^{\natural}-convex functions, and let \hat{f}_{t}:\overline{\mathcal{K}}\rightarrow\mathbb{R}~(t=1,2,\dots,T) be the convex extension of f_{t}. Let x_{t}\in\overline{\mathcal{K}}~(t=1,2,\dots,T) be defined by x_{1}=0 and x_{t+1}=\Pi_{\overline{\mathcal{K}}}(x_{t}-\eta\hat{g}_{t}), where \hat{g}_{1},\hat{g}_{2},\dots,\hat{g}_{T} are vector-valued random variables such that \mathbf{E}[\hat{g}_{t}\mid x_{t}]=g_{t}, where g_{t} is a subgradient of \hat{f}_{t} at x_{t}. Then the expected regret of playing x_{1},x_{2},\dots,x_{T} is bounded as

\sum_{t=1}^{T}\mathbf{E}[\hat{f}_{t}(x_{t})]-\min_{x^{*}\in\mathcal{K}}\sum_{t=1}^{T}\hat{f}_{t}(x^{*})\leq\frac{dN^{2}}{2\eta}+\frac{\eta}{2}\sum_{t=1}^{T}\mathbf{E}[\|\hat{g}_{t}\|^{2}]. (14)

The proof is similar to Lemma 11 in Hazan and Kale [12].

Proof.

Let y_{t+1}=x_{t}-\eta\hat{g}_{t}, so that x_{t+1}=\Pi_{\overline{\mathcal{K}}}(y_{t+1}). Note that

\|y_{t+1}-x^{*}\|^{2}=\|x_{t}-x^{*}\|^{2}-2\eta\hat{g}_{t}^{\top}(x_{t}-x^{*})+\eta^{2}\|\hat{g}_{t}\|^{2}. (15)

Rearranging Equation (15), we have

\hat{g}_{t}^{\top}(x_{t}-x^{*})=\frac{1}{2\eta}(\|x_{t}-x^{*}\|^{2}-\|y_{t+1}-x^{*}\|^{2})+\frac{\eta}{2}\|\hat{g}_{t}\|^{2}. (16)

Utilizing the property that \|x_{t+1}-x^{*}\|\leq\|y_{t+1}-x^{*}\| (a consequence of the properties of Euclidean projections onto convex sets) leads to

\hat{g}_{t}^{\top}(x_{t}-x^{*})\leq\frac{1}{2\eta}(\|x_{t}-x^{*}\|^{2}-\|x_{t+1}-x^{*}\|^{2})+\frac{\eta}{2}\|\hat{g}_{t}\|^{2}. (17)

Summing over t=1,2,\dots,T, we have

\sum_{t=1}^{T}\hat{g}_{t}^{\top}(x_{t}-x^{*}) \leq\sum_{t=1}^{T}\frac{1}{2\eta}(\|x_{t}-x^{*}\|_{2}^{2}-\|x_{t+1}-x^{*}\|_{2}^{2})+\frac{\eta}{2}\|\hat{g}_{t}\|_{2}^{2} (18)
=\frac{1}{2\eta}(\|x_{1}-x^{*}\|_{2}^{2}-\|x_{T+1}-x^{*}\|_{2}^{2})+\sum_{t=1}^{T}\frac{\eta}{2}\|\hat{g}_{t}\|_{2}^{2} (19)
\leq\frac{1}{2\eta}\|x_{1}-x^{*}\|_{2}^{2}+\sum_{t=1}^{T}\frac{\eta}{2}\|\hat{g}_{t}\|_{2}^{2} (20)
\leq\frac{1}{2\eta}\left(d\|x_{1}-x^{*}\|_{\infty}^{2}\right)+\sum_{t=1}^{T}\frac{\eta}{2}\|\hat{g}_{t}\|_{2}^{2} (21)
\leq\frac{dN^{2}}{2\eta}+\sum_{t=1}^{T}\frac{\eta}{2}\|\hat{g}_{t}\|_{2}^{2}, (22)

where inequality (21) is derived from the relationship \|x\|_{2}^{2}\leq d\|x\|_{\infty}^{2}, and inequality (22) uses the bound \|x_{1}-x^{*}\|_{\infty}^{2}\leq N^{2}, which follows from the definition of N. Next, given that \mathbf{E}[\hat{g}_{t}\mid x_{t}]=g_{t} (a subgradient of \hat{f}_{t} at x_{t}), we obtain

\mathbf{E}[\hat{g}_{t}^{\top}(x_{t}-x^{*})\mid x_{t}]=g_{t}^{\top}(x_{t}-x^{*})\geq\hat{f}_{t}(x_{t})-\hat{f}_{t}(x^{*}), (23)

owing to the convexity of \hat{f}_{t}. By taking the expectation over the selection of x_{t}, we derive

\mathbf{E}[\hat{g}_{t}^{\top}(x_{t}-x^{*})]\geq\mathbf{E}[\hat{f}_{t}(x_{t})]-\hat{f}_{t}(x^{*}). (24)

Consequently, the expected regret can be bounded as follows:

\sum_{t=1}^{T}\left(\mathbf{E}[\hat{f}_{t}(x_{t})]-\hat{f}_{t}(x^{*})\right)\leq\mathbf{E}\left[\sum_{t=1}^{T}\hat{g}_{t}^{\top}(x_{t}-x^{*})\right]\leq\frac{dN^{2}}{2\eta}+\frac{\eta}{2}\sum_{t=1}^{T}\mathbf{E}[\|\hat{g}_{t}\|_{2}^{2}]. (25)

∎

Building on the aforementioned results, we give the regret bounds in Theorem 3.1.

Theorem 3.1

When Algorithm 2 is executed with the parameter \eta=\sqrt{\frac{dN^{2}}{T(\frac{3}{2}\hat{L})^{2}}}, it achieves a regret bound of \mathbf{E}[R_{T}]\leq\frac{3}{2}N\hat{L}\sqrt{dT}.

It can be proved by using Lemmas 1, 6, and 7.

Proof.

Using Lemmas 1, 6, and 7, we derive

\mathbf{E}[R_{T}] =\sum_{t=1}^{T}\mathbf{E}[f_{t}(z_{t})]-\min_{z^{*}\in\mathcal{K}}\sum_{t=1}^{T}f_{t}(z^{*}) (26)
=\sum_{t=1}^{T}\mathbf{E}[\hat{f}_{t}(x_{t})]-\min_{x^{*}\in\overline{\mathcal{K}}}\sum_{t=1}^{T}\hat{f}_{t}(x^{*}) (27)
\leq\frac{dN^{2}}{2\eta}+\frac{\eta}{2}\sum_{t=1}^{T}\mathbf{E}[\|\hat{g}_{t}\|_{2}^{2}] (28)
\leq\frac{dN^{2}}{2\eta}+\frac{\eta}{2}T\left(\frac{3}{2}\hat{L}\right)^{2}=\frac{3}{2}N\hat{L}\sqrt{dT}, (29)

where the transition to equation (27) is justified by the identity \mathbf{E}[f_{t}(z_{t})]=\mathbf{E}[\hat{f}_{t}(x_{t})] as established in Lemma 1. The bound in inequality (28) is obtained from Lemma 7, and the final inequality (29) is supported by the norm bound \|\hat{g}_{t}\|_{2}\leq\|\hat{g}_{t}\|_{1}\leq\frac{3}{2}\hat{L}, as stated in Lemma 6. ∎

3.2 Bandit Setting

In this section, we extend the online submodular minimization algorithm (Algorithm 3 in Hazan and Kale [12]) to \mathrm{L}^{\natural}-convex functions to obtain an upper bound on the regret. Let f_{t}:\mathcal{K}\rightarrow[-M,M]~(\forall t\in[T]), i.e., the function values are bounded.

We describe the subgradient descent algorithm under the bandit setting for \mathrm{L}^{\natural}-convex functions. At each iteration t, the algorithm finds the maximal chain associated with x_{t} and its associated permutation \pi. This allows x_{t} to be represented as a convex combination. A point z_{t} is chosen according to probabilities \rho_{i}, which are derived from the coefficients \mu_{i} in the representation of x_{t} and the parameter \delta. The cost f_{t}(z_{t}) at the chosen point z_{t} is observed. An unbiased estimator \hat{g}_{t} of a subgradient of \hat{f}_{t} at x_{t} is computed, depending on the value of z_{t}, the probabilities \rho_{i}, and a randomly selected sign \varepsilon_{t}. The algorithm then updates x_{t+1} using the current point x_{t}, the step size \eta, and the estimated subgradient \hat{g}_{t}. The process is summarized in Algorithm 3 below.

Algorithm 3 Bandit \mathrm{L}^{\natural}-Convex Subgradient Descent
1: Parameters: \eta,\delta>0. Let x_{1}\in\overline{\mathcal{K}} be an arbitrary initial point.
2: for t=1~\text{to}~T do
3:     Find a maximal chain associated with x_{t}, \emptyset=A_{0}\subsetneq A_{1}\subsetneq\dots\subsetneq A_{d}=[d], and \underline{z} by Algorithm 1.
4:     Choose the point z_{t} as follows:
z_{t}=\underline{z}+\chi_{A_{i}}\quad\text{with probability}\quad\rho_{i}=(1-\delta)\mu_{i}+\frac{\delta}{d+1}.
5:     Obtain cost f_{t}(z_{t}).
6:     Compute an unbiased estimator \hat{g}_{t} of a subgradient of \hat{f}_{t} at x_{t} as follows. If z_{t}=\underline{z}+\chi_{A_{0}}, then set \hat{g}_{t}=-\frac{1}{\rho_{0}}f_{t}(z_{t})e_{\pi(1)}, and if z_{t}=\underline{z}+\chi_{A_{d}}, then set \hat{g}_{t}=\frac{1}{\rho_{d}}f_{t}(z_{t})e_{\pi(d)}. Otherwise, z_{t}=\underline{z}+\chi_{A_{i}} for some 1\leq i\leq d-1. Choose \varepsilon_{t}\in\{-1,1\} uniformly at random, and set:
\hat{g}_{t}=\begin{cases}\frac{2}{\rho_{i}}f_{t}(z_{t})e_{\pi(i)}&\text{if}\quad\varepsilon_{t}=1\\ -\frac{2}{\rho_{i}}f_{t}(z_{t})e_{\pi(i+1)}&\text{if}\quad\varepsilon_{t}=-1.\end{cases}
7:     Update: set x_{t+1}=\Pi_{\overline{\mathcal{K}}}(x_{t}-\eta\hat{g}_{t}).
8: end for
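The following Python sketch implements one round of Algorithm 3 for the box domain \mathcal{K}=[N]^{d}: it recovers the weights \mu_{i} of the chain decomposition of x_{t}, mixes them with uniform exploration to obtain \rho_{i}, samples z_{t}, and builds the one-sample estimator \hat{g}_{t}. The function names are illustrative, and cost_t is a hypothetical bandit oracle returning only f_{t}(z_{t}).

import numpy as np

def bandit_step(cost_t, x, N, eta, delta, rng):
    """One round of Algorithm 3 on the box K = {1,...,N}^d (sketch)."""
    d = len(x)
    z_bar = np.minimum(np.floor(x), N - 1).astype(int)
    frac = x - z_bar
    pi = np.argsort(-frac)                       # chain A_k = {pi(1),...,pi(k)}
    sorted_frac = frac[pi]
    # Weights mu_0,...,mu_d of the convex combination x = z_bar + sum_k mu_k chi_{A_k}.
    mu = np.empty(d + 1)
    mu[0] = 1.0 - sorted_frac[0]
    mu[1:d] = sorted_frac[:d - 1] - sorted_frac[1:]
    mu[d] = sorted_frac[-1]
    rho = (1.0 - delta) * mu + delta / (d + 1)   # exploration-mixed sampling weights
    rho = rho / rho.sum()                        # guard against round-off
    i = rng.choice(d + 1, p=rho)
    a = np.zeros(d, dtype=int)
    a[pi[:i]] = 1
    fz = cost_t(z_bar + a)                       # single bandit observation f_t(z_t)
    g_hat = np.zeros(d)
    if i == 0:
        g_hat[pi[0]] = -fz / rho[0]
    elif i == d:
        g_hat[pi[d - 1]] = fz / rho[d]
    else:
        if rng.integers(2) == 1:                 # epsilon_t = +1
            g_hat[pi[i - 1]] = 2.0 * fz / rho[i]
        else:                                    # epsilon_t = -1
            g_hat[pi[i]] = -2.0 * fz / rho[i]
    return np.clip(x - eta * g_hat, 1, N), fz    # projected update and observed cost

# Example usage with a made-up cost.
rng = np.random.default_rng(0)
x = np.full(3, 2.5)
x, observed = bandit_step(lambda z: float(z.max()), x, N=5, eta=0.05, delta=0.3, rng=rng)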

In Algorithm 3, when the loss function f_{t} is replaced with its convex extension \hat{f}_{t}, the resulting error is bounded by 2\delta M.

Lemma 8

For all t\in[T], we have \mathbf{E}[f_{t}(z_{t})]\leq\mathbf{E}[\hat{f}_{t}(x_{t})]+2\delta M.

The proof is similar to Lemma 15 in [12].

Proof.

Let \emptyset=A_{0}\subsetneq A_{1}\subsetneq\dots\subsetneq A_{d}=[d] and \underline{z}\in\mathcal{K} be a maximal chain associated with x_{t}. From Definition 7, we have \hat{f}_{t}(x_{t})=\sum_{i}\mu_{i}f_{t}(\underline{z}+\chi_{A_{i}}). On the other hand, \mathbf{E}_{t}[f_{t}(z_{t})]=\sum_{i}\rho_{i}f_{t}(\underline{z}+\chi_{A_{i}}), and hence:

\mathbf{E}_{t}[f_{t}(z_{t})]-\hat{f}_{t}(x_{t}) =\sum_{i=0}^{d}(\rho_{i}-\mu_{i})f_{t}(\underline{z}+\chi_{A_{i}}) (30)
=\delta\sum_{i=0}^{d}\left(\frac{1}{d+1}-\mu_{i}\right)f_{t}(\underline{z}+\chi_{A_{i}}) (31)
\leq\delta\sum_{i=0}^{d}\left(\frac{1}{d+1}+\mu_{i}\right)|f_{t}(\underline{z}+\chi_{A_{i}})| (32)
\leq 2\delta M. (33)

Then, taking the expectation with respect to the filtration \mathcal{F}_{t-1}, we obtain \mathbf{E}[f_{t}(z_{t})]-\mathbf{E}[\hat{f}_{t}(x_{t})]\leq 2\delta M. ∎

For the regret analysis, we show that the second moment of the unbiased subgradient estimator is bounded from above as follows.

Lemma 9

For all t, we have \mathbf{E}[\|\hat{g}_{t}\|^{2}]\leq\frac{16M^{2}d^{2}}{\delta}.

Proof.

Let \emptyset=A_{0}\subsetneq A_{1}\subsetneq\dots\subsetneq A_{d}=[d] and \underline{z}\in\mathcal{K} be a maximal chain associated with x_{t}. Since \hat{g}_{t} is an unbiased estimator of g_{t}, we have \mathbf{E}[\hat{g}_{t}\mid x_{t}]=\mathbf{E}_{t}[\hat{g}_{t}]=g_{t}. Thus, the following is obtained:

\mathbf{E}_{t}[\|\hat{g}_{t}\|^{2}]\leq\sum_{i=0}^{d}\frac{4}{\rho_{i}^{2}}f_{t}(\underline{z}+\chi_{A_{i}})^{2}\cdot\rho_{i}\leq\frac{4M^{2}(d+1)^{2}}{\delta}\leq\frac{16M^{2}d^{2}}{\delta}. (34)

Then, as in Lemma 8, taking the expectation with respect to the filtration \mathcal{F}_{t-1}, we obtain the desired inequality \mathbf{E}[\|\hat{g}_{t}\|^{2}]\leq\frac{16M^{2}d^{2}}{\delta}. ∎

Thus, the following regret upper bound is obtained.

Theorem 3.2

Algorithm 3, run with parameters \delta=\frac{d}{T^{1/3}} and \eta=\frac{N}{4MT^{2/3}}, achieves the following regret bound: \mathbf{E}[R_{T}]\leq 6dNMT^{2/3}.

It can be proved by using Lemmas 7, 8 and 9.

Proof.

Using Lemmas 7, 8 and 9, we have

\mathbf{E}[R_{T}] =\sum_{t=1}^{T}\mathbf{E}[f_{t}(z_{t})]-\min_{z^{*}\in\mathcal{K}}\sum_{t=1}^{T}f_{t}(z^{*}) (35)
\leq 2\delta MT+\sum_{t=1}^{T}\mathbf{E}[\hat{f}_{t}(x_{t})]-\min_{x^{*}\in\overline{\mathcal{K}}}\sum_{t=1}^{T}\hat{f}_{t}(x^{*}) (36)
\leq 2\delta MT+\frac{dN^{2}}{2\eta}+\frac{\eta}{2}\sum_{t=1}^{T}\mathbf{E}[\|\hat{g}_{t}\|^{2}] (37)
\leq 2\delta MT+\frac{dN^{2}}{2\eta}+\frac{8d^{2}M^{2}\eta T}{\delta} (38)
=2dMT^{2/3}(2N+1) (39)
\leq 6dNMT^{2/3}, (40)

where the bound in inequality (36) is obtained from Lemma 8, the bound in inequality (37) from Lemma 7, and the bound in inequality (38) from Lemma 9. ∎

4 Lower Bound on Regret

We provide a lower bound on the regret, showing that any algorithm for online \mathrm{L}^{\natural}-convex minimization necessarily incurs regret of \Omega(\hat{L}N\sqrt{dT}). This result implies that the upper bound presented in Theorem 3.1 is optimal up to a constant factor.

Theorem 4.1

For any algorithm solving online \mathrm{L}^{\natural}-convex minimization, there exists a sequence of \mathrm{L}^{\natural}-convex functions f_{1},f_{2},\dots,f_{T}\colon[N]^{d}\rightarrow\mathbb{R}, with l_{\infty}-Lipschitz constant \hat{L}, such that the regret is at least \Omega(\hat{L}N\sqrt{dT}).

The proof is similar to Theorem 14 in Hazan and Kale [12].

Proof.

Consider a random sequence of cost functions. At each iteration t\in[T], select i(t)=(t\bmod d)+1\in[d] and a Rademacher random variable \sigma_{t}\in\{-1,1\}, chosen independently of all other random variables. Define f_{t}\colon[N]^{d}\rightarrow\mathbb{R} as

f_{t}(s)=\sigma_{t}\hat{L}s_{i(t)}\qquad(s\in[N]^{d}), (41)

where s_{i} denotes the i-th element of s for i\in[d].

Since f_{t} is a linear function, it is also an \mathrm{L}^{\natural}-convex function. Furthermore, by the properties of Rademacher random variables, it holds that \mathbf{E}[f_{t}(s)]=0 for any fixed s. Consequently, the following holds for the regret:

\mathbf{E}[R_{T}]=\mathbf{E}\left[\sum_{t=1}^{T}f_{t}(s_{t})\right]-\mathbf{E}\left[\min_{s\in[N]^{d}}\sum_{t=1}^{T}f_{t}(s)\right]=-\mathbf{E}\left[\min_{s\in[N]^{d}}\sum_{t=1}^{T}f_{t}(s)\right]. (42)

To compute the regret, we construct s^{*}={\rm argmin}_{s\in[N]^{d}}\sum_{t=1}^{T}f_{t}(s) as follows:

X_{i}=\sum_{t:i(t)=i}\sigma_{t},\qquad s_{i}^{*}=\begin{cases}1&\text{if }X_{i}\geq 0\\ N&\text{if }X_{i}<0.\end{cases} (43)

When the slope of \sum_{t=1}^{T}f_{t}(s) along the i-th axis is positive, s_{i}=1 attains the minimum, and when it is negative, s_{i}=N does. Thus, s^{*}=\underset{s\in[N]^{d}}{\text{argmin}}\sum_{t=1}^{T}f_{t}(s) holds. Therefore, we have

\mathbf{E}\left[\sum_{t=1}^{T}f_{t}(s^{*})\right] =\mathbf{E}\left[\sum_{t=1}^{T}\sigma_{t}\hat{L}s^{*}_{i(t)}\right]=\mathbf{E}\left[\sum_{i=1}^{d}\sum_{t:i(t)=i}\sigma_{t}\hat{L}s^{*}_{i}\right]=\mathbf{E}\left[\sum_{i=1}^{d}X_{i}\hat{L}s^{*}_{i}\right] (44)
=\mathbf{E}\left[\hat{L}\sum_{i=1}^{d}X_{i}\left(1\cdot\mathbb{I}[X_{i}\geq 0]+N\cdot\mathbb{I}[X_{i}<0]\right)\right] (45)
=\mathbf{E}\left[\hat{L}\sum_{i=1}^{d}\left(-\frac{N-1}{2}|X_{i}|+\frac{N+1}{2}X_{i}\right)\right] (46)
=-\Omega\left(\hat{L}\cdot d\cdot\left(\frac{N-1}{2}\right)\cdot\sqrt{\frac{T}{d}}\right) (47)
=-\Omega\left(\hat{L}N\sqrt{dT}\right), (48)

where the transition to (47) follows from Khintchine's inequality (see, e.g., [5]) and \mathbf{E}[X_{i}]=0. ∎
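A small simulation (a sketch under made-up parameter values) illustrates the construction: the player's expected per-round cost is zero, while the hindsight optimum s^{*} of (43) gains on the order of \hat{L}N\sqrt{dT}.

import numpy as np

rng = np.random.default_rng(0)
d, N, T, L_hat = 5, 10, 20_000, 1.0
sigma = rng.choice([-1, 1], size=T)                 # Rademacher signs sigma_t
i_t = np.arange(T) % d                              # coordinate hit at round t
X = np.array([sigma[i_t == i].sum() for i in range(d)])
s_star = np.where(X >= 0, 1, N)                     # hindsight optimum (43)
min_cost = L_hat * float(np.dot(X, s_star))         # min_s sum_t f_t(s)
print("empirical regret of any player ~", -min_cost)
print("Khintchine-based prediction    ~",
      L_hat * (N - 1) / 2 * np.sqrt(2 / np.pi) * np.sqrt(d * T))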

5 Applications

This section presents several applications that can be captured in the framework of online \mathrm{L}^{\natural}-convex minimization.

One is a natural online extension of the spare parts inventory control problem. A straightforward online version of the existing model would require feedback of expected values. Requiring expected-value feedback weakens the advantage of the online setting, which does not require assumptions on the demand distribution. Our proposed spare parts inventory management problem does not require expectation feedback.

The second is an application of queueing theory to the call center shift scheduling problem. In queueing theory, multimodular functions often appear, which are equivalent to \mathrm{L}^{\natural}-convex functions under a simple linear (unimodular) transformation. As one such example, we confirm that the shift scheduling problem in a call center falls within the framework of online \mathrm{L}^{\natural}-convex minimization.

5.1 Online Inventory System of Repairable Spare Parts

The spare parts inventory management problem, proposed by Miller [14], is used for parts management in aircraft maintenance, where the quantity demanded and the quantity ordered take discrete values. Miller's model seeks to minimize the cost of manufacturing a product with d\in\mathbb{Z}_{+} parts, which is formulated as the sum of a penalty determined by the maximum number of shortages over the parts and the cost of purchasing spare parts in advance. Let c_{j}>0 be the unit price of part type j\in[d] and z_{j}\in\{0,1,\dots,N\} be the quantity of spare parts ordered for type j (N\in\mathbb{Z}_{+} is the maximum amount that can be purchased). The cost of purchasing spare parts is \sum_{j=1}^{d}c_{j}z_{j}. On the other hand, let \varphi_{j}(m) be the probability that the demand for type j is m, and let F_{j}(k)=\sum_{m=0}^{k}\varphi_{j}(m) be the cumulative distribution function; then the expected maximum number of shortages over the parts is \sum_{k=0}^{\infty}(1-\prod_{j=1}^{d}F_{j}(z_{j}+k)). Therefore, the objective of the offline spare parts inventory control problem is to solve the following problem:

\min_{z\in[N]^{d}}\sum_{k=0}^{\infty}\Bigl(1-\prod_{j=1}^{d}F_{j}(z_{j}+k)\Bigr)+\sum_{j=1}^{d}c_{j}z_{j}. (49)

It is known that this problem can be formulated in the framework of \mathrm{L}^{\natural}-convex minimization [15].

Consider Equation (49) as an online problem. After each decision is made, the sum of the purchase cost of the components and the expected maximum number of shortages would be obtained as feedback.

However, the expected value of the maximum number of shortages is not appropriate as feedback. If it could be observed as feedback, there would be no need to solve the problem online, since the expected value is the average after sufficiently many iterations.

Therefore, we redefine the problem as a more natural online decision-making problem and show that it can be captured in the online \mathrm{L}^{\natural}-convex minimization framework. Existing formulations consider maximizing expected profit with a known demand distribution, where the expected value is given as feedback; in practice, however, it is inconvenient for the user to feed back expected values. In contrast, we do not assume a distribution, which means that the model can adapt to any environment.

This model is an inventory model that minimizes the regret with respect to the long-term loss determined by the order quantities in each round. The problem is formulated as an online \mathrm{L}^{\natural}-convex minimization problem. The parameters are listed in Table 3.

Table 3: Parameters of the Online Inventory System for Repairable Spare Parts
Parameter | Meaning | Variable Type
d\in\mathbb{Z}_{+} | Number of parts | Given
c\in\mathbb{R}^{d} | Unit cost of parts | Given
p\in\mathbb{R} | Penalty cost | Given
z\in\mathbb{Z}^{d} | Order quantity of parts | Decision
f_{t}:[N]^{d}\rightarrow\mathbb{R}~(t\in[T]) | Cost function at round t | Feedback of round t
y_{t}\in\mathbb{Z}^{d}~(t\in[T]) | Demand for parts at round t | Feedback of round t

The loss function consists of the sum of the penalty cost determined by the maximum number of missing parts and the cost of purchasing spare parts, and is formulated as follows:

f_{t}(z)=p\Bigl(\max_{j\in[d]}\max(y_{t,j}-z_{j},0)\Bigr)+\sum_{j=1}^{d}c_{j}z_{j}. (50)

In this case, f_{t} is an \mathrm{L}^{\natural}-convex function, so the problem is an online \mathrm{L}^{\natural}-convex minimization problem. We now show that the loss function f_{t} is an \mathrm{L}^{\natural}-convex function. In preparation for the proof, we introduce the following lemma.

Lemma 10 (Maximum-component function [16])

Let p=(p_{1},p_{2},\dots,p_{n})\in\mathbb{Z}^{n} and let g:\mathbb{Z}^{n}\rightarrow\mathbb{R}. For any \tau_{0},\tau_{1},\dots,\tau_{n}\in\mathbb{R},

g(p)=\max\{\tau_{0},p_{1}+\tau_{1},\dots,p_{n}+\tau_{n}\}

is an \mathrm{L}^{\natural}-convex function.

Next, we show the following lemma.

Lemma 11

Let f:\mathbb{Z}^{n}\rightarrow\mathbb{R} be an \mathrm{L}^{\natural}-convex function and define g:\mathbb{Z}^{n}\rightarrow\mathbb{R} by g(x)=f(-x). Then g is an \mathrm{L}^{\natural}-convex function.

Proof.

We show that g(x) satisfies discrete midpoint convexity:

g(x)+g(y) =f(-x)+f(-y)
\geq f\left(\left\lceil\frac{-(x+y)}{2}\right\rceil\right)+f\left(\left\lfloor\frac{-(x+y)}{2}\right\rfloor\right)
=f\left(-\left\lfloor\frac{x+y}{2}\right\rfloor\right)+f\left(-\left\lceil\frac{x+y}{2}\right\rceil\right)
=g\left(\left\lfloor\frac{x+y}{2}\right\rfloor\right)+g\left(\left\lceil\frac{x+y}{2}\right\rceil\right).

∎

We now show that the loss function is an \mathrm{L}^{\natural}-convex function by using Lemmas 10 and 11.

Lemma 12

f_{t}(z)=p\Bigl(\max_{j\in[d]}\max(y_{t,j}-z_{j},0)\Bigr)+\sum_{j=1}^{d}c_{j}z_{j}

is an \mathrm{L}^{\natural}-convex function.

Proof.

By Lemmas 10 and 11, \max_{j\in[d]}\max(y_{t,j}-z_{j},0) is an \mathrm{L}^{\natural}-convex function of z. Since \sum_{j=1}^{d}c_{j}z_{j} is linear, it is also \mathrm{L}^{\natural}-convex, and the sum of \mathrm{L}^{\natural}-convex functions is \mathrm{L}^{\natural}-convex. Thus, f_{t} is an \mathrm{L}^{\natural}-convex function. ∎
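As a sanity check of Lemma 12, the following Python sketch verifies discrete midpoint convexity of the loss (50) numerically on a small box; the demand y, prices c, and penalty p are made-up test values.

import itertools
import numpy as np

d, N = 2, 4
p, c, y = 3.0, np.array([0.5, 1.0]), np.array([2, 3])          # made-up test values
f = lambda z: p * np.maximum(y - z, 0).max() + float(c @ z)    # loss (50) for one round
pts = [np.array(q) for q in itertools.product(range(N + 1), repeat=d)]
ok = all(f(a) + f(b) >= f(np.ceil((a + b) / 2).astype(int))
         + f(np.floor((a + b) / 2).astype(int)) - 1e-9
         for a in pts for b in pts)
print(ok)   # True: discrete midpoint convexity holds on this box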

5.2 Shift Scheduling with a Global Service Level Constraint

Here, we introduce the shift scheduling problem with a global service level constraint proposed by Koole and van der Sluis [13], which originates from queueing theory, and its online version.

Consider a single service center. This model aims to minimize the sum of the loss due to overall service level degradation over time and the labor costs. A call center operates over I time intervals, denoted by i=1,\dots,I. Each operator works for M consecutive time intervals. There are K types of work shifts, whose starting points are specified in advance. Shifts are denoted by k=1,\dots,K, and the starting time interval of shift k is denoted by I_{k}. For each iteration t and each time interval i, there is a function g_{t,i}(n_{i}) that represents the service level in that interval, where n_{i} is the number of operators working in time interval i. The function g_{t,i} is a monotonically increasing concave function. For each iteration t, the overall service level S_{t} is given by S_{t}=\sum_{1\leq i\leq I}g_{t,i}(n_{i}). The parameters are listed in Table 4.

Table 4: Parameters for Shift Scheduling with a Global Service Level Constraint
Parameter | Meaning | Variable Type
N\in\mathbb{Z}_{+} | Number of operators | Given
K\in\mathbb{Z}_{+} | Number of shift types | Given
I\in\mathbb{Z}_{+} | Number of time intervals | Given
l\in\mathbb{Z}^{K} | Labor costs of the shifts | Given
y\in[N]^{K} | Number of operators assigned to each shift | Decision
c\in\mathbb{Z} | Limit on the time a customer may be kept waiting | Given
G\in\mathbb{Z} | Projected profit if a customer is satisfied | Given
r\in[0,1] | Probability of not making a profit if a customer is not satisfied | Given

Let y_{k} be the number of operators assigned to the k-th shift; then the number of operators working in time interval i is h_{i}(y)=\sum_{k:i-M<I_{k}\leq i}y_{k}. Thus, the service level S_{t} is equal to

S_{t}(y)=\sum_{1\leq i\leq I}g_{t,i}(h_{i}(y)). (51)

Let \lambda be the arrival rate of customers in each time interval and \mu be the service rate of an operator. Furthermore, by setting a threshold c on the time a customer may be kept waiting, the percentage of customers who connect to an operator within this time can be defined as a service level based on the M/M/n queueing model. Therefore, the objective function f^{\prime}_{t} to be minimized is as follows:

f^{\prime}_{t}(y):=-GrS_{t}(y)+\sum_{1\leq k\leq K}l_{k}y_{k}. (52)

This objective function is known to be a multimodular function [13]. The multimodular function f^{\prime}_{t} becomes an \mathrm{L}^{\natural}-convex function f_{t} under a unimodular transformation, so the problem can be solved in the framework of online \mathrm{L}^{\natural}-convex minimization. In the online decision-making version of this model, the allocation y to shifts must be determined to minimize the long-term loss without a priori knowledge of the functions g_{t,i} representing the service level in time interval i; a small sketch of the objective is given below.
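A minimal Python sketch of the objective (52) follows; the concave service-level curve g_{t,i} is replaced by an illustrative placeholder rather than the M/M/n formula, and all parameter values are made up.

import numpy as np

def shift_cost(y, starts, M, I, l, G, r, g=lambda n: 1.0 - np.exp(-0.5 * n)):
    """Objective (52) with an illustrative concave service-level curve g.
    y[k]: operators on shift k; starts[k]: first interval covered by shift k;
    each shift covers M consecutive intervals."""
    h = np.zeros(I)                                  # staffing level h_i(y)
    for k, s in enumerate(starts):
        h[s:s + M] += y[k]
    service = sum(g(n) for n in h)                   # S_t(y) of (51)
    return -G * r * service + float(np.dot(l, y))    # -G r S_t(y) + sum_k l_k y_k

# Example with K = 3 shift types over I = 6 intervals, each shift lasting M = 3 intervals.
print(shift_cost(np.array([2, 1, 3]), starts=[0, 2, 3], M=3, I=6,
                 l=np.array([1.0, 1.0, 1.2]), G=10.0, r=0.5))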

6 Conclusions

We proposed computationally efficient algorithms for online \mathrm{L}^{\natural}-convex minimization, which extends online submodular minimization. Our algorithms apply to two major settings: the full information setting and the bandit setting. We provided regret analyses of our algorithms and a lower bound on the regret, and in particular showed that in the full information setting our proposed algorithm achieves a tight regret bound up to a constant factor. We also demonstrated that online \mathrm{L}^{\natural}-convex minimization naturally captures various problems, including the spare parts inventory management problem and the shift scheduling problem.

Acknowledgement

This work was partially supported by the JST ERATO Grant Number JPMJER2301. Additionally, portions of this research were conducted during visits by the first author, Yokoyama, and the fourth author, Kimura, to NEC Corporation.

References

  • [1] Altman, E.: Discrete-event Control of Stochastic Networks: Multimodularity and Regularity. Springer Science & Business Media (2003)
  • [2] Altman, E., Gaujal, B., Hordijk, A.: Multimodularity, convexity, and optimization properties. Mathematics of Operations Research 25(2), 324–347 (2000)
  • [3] Bach, F.: Learning with submodular functions: A convex optimization perspective. Foundations and Trends® in machine learning 6(2-3), 145–373 (2013)
  • [4] Bubeck, S., Eldan, R., Lee, Y.T.: Kernel-based methods for bandit convex optimization. Journal of the ACM 68(4), 1–35 (2021)
  • [5] Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press (2006)
  • [6] Chen, L., Hassani, H., Karbasi, A.: Online continuous submodular maximization. In: International Conference on Artificial Intelligence and Statistics. pp. 1896–1905. PMLR (2018)
  • [7] Chen, X.: L\mathrm{L}^{\natural}-convexity and its applications in operations. Frontiers of Engineering Management 4(3), 283–294 (2017)
  • [8] Chen, X., Li, M.: Discrete convex analysis and its applications in operations: A survey. Production and Operations Management 30(6), 1904–1926 (2021)
  • [9] Freund, D., Henderson, S.G., Shmoys, D.B.: Minimizing multimodular functions and allocating capacity in bike-sharing systems. In: 19th International Conference on Integer Programming and Combinatorial Optimization. pp. 186–198. Springer (2017)
  • [10] Fujishige, S.: Submodular Functions and Optimization. Elsevier (2005)
  • [11] Fujishige, S., Murota, K.: Notes on L-/M-convex functions and the separation theorems. Mathematical Programming 88, 129–146 (2000)
  • [12] Hazan, E., Kale, S.: Online submodular minimization. Journal of Machine Learning Research 13(10) (2012)
  • [13] Koole, G., van der Sluis, E.: Optimal shift scheduling with a global service level constraint. IIE Transactions 35(11), 1049–1055 (2003)
  • [14] Miller, B.L.: On minimizing nonseparable functions defined on the integers with an inventory application. SIAM Journal on Applied Mathematics 21(1), 166–185 (1971)
  • [15] Moriguchi, S., Murota, K.: Discrete Hessian matrix for L-convex functions. IEICE transactions on fundamentals of electronics, communications and computer sciences 88(5), 1104–1108 (2005)
  • [16] Murota, K.: Discrete convex analysis. Mathematical Programming 83, 313–371 (1998)
  • [17] Murota, K.: Discrete Convex Analysis. SIAM (2003)
  • [18] Murota, K.: Discrete convex analysis: A tool for economics and game theory. arXiv preprint arXiv:2212.03598 (2022), (Preliminary version: Murota, K.: Discrete convex analysis: A tool for economics and game theory. The Journal of Mechanism and Institution Design 1(1), 151–273 (2016))
  • [19] Qin, L., Chen, S., Zhu, X.: Contextual combinatorial bandit and its application on diversified online recommendation. In: Proceedings of the 2014 SIAM International Conference on Data Mining. pp. 461–469. SIAM (2014)
  • [20] Sakaue, S., Oki, T.: Rethinking warm-starts with predictions: Learning predictions close to sets of optimal solutions for faster L\mathrm{L}-/L\mathrm{L}^{\natural}-convex function minimization. In: International Conference on Machine Learning. pp. 29760–29776. PMLR (2023)
  • [21] Shalev-Shwartz, S.: Online learning and online convex optimization. Foundations and Trends® in Machine Learning 4(2), 107–194 (2012)
  • [22] Shioura, A.: Algorithms for L-convex function minimization: Connection between discrete convex analysis and other research fields. Journal of the Operations Research Society of Japan 60(3), 216–243 (2017)
  • [23] Tsuchiya, T., Ito, S., Honda, J.: Further adaptive best-of-both-worlds algorithm for combinatorial semi-bandits. In: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics. pp. 8117–8144. PMLR (2023)
  • [24] Zhang, H., Zheng, Z., Lavaei, J.: Stochastic L\mathrm{L}^{\natural}-convex function minimization. Advances in Neural Information Processing Systems 34, 13004–13018 (2021)