Stochastic $k$ -Submodular Bandits with Full Bandit Feedback

Guanyu Nie Vaneet Aggarwal Christopher John Quinn

Abstract

In this paper, we present the first sublinear $\alpha$ -regret bounds for online $k$ -submodular optimization problems with full-bandit feedback, where $\alpha$ is a corresponding offline approximation ratio. Specifically, we propose online algorithms for multiple $k$ -submodular stochastic combinatorial multi-armed bandit problems, including (i) monotone functions and individual size constraints, (ii) monotone functions with matroid constraints, (iii) non-monotone functions with matroid constraints, (iv) non-monotone functions without constraints, and (v) monotone functions without constraints. We transform approximation algorithms for offline $k$ -submodular maximization problems into online algorithms through the offline-to-online framework proposed by Nie et al. (2023a). A key contribution of our work is analyzing the robustness of the offline algorithms.

Machine Learning, ICML

1 INTRODUCTION

In various sequential decision problems, including sensor placement, influence maximization, and clinical trials, decisions are made by sequentially selecting a subset of elements, making assignments for those elements, and then observing the outcomes. These scenarios often exhibit diminishing returns based on the choice of the subset of elements and their corresponding assignments.

Consider a multi-agent scenario where multiple companies (agents) cooperate to spread $k$ different types of content across a social network. Each company can select a subset of users and control how it distributes content to initiate propagation. The collective goal is to maximize the overall spread of all content types across the network. However, each company’s decisions impact not only its own success but also that of others. For instance, if two companies target users with overlapping follower networks, their efforts may lead to redundant influence, limiting the potential additional spread. Conversely, well-coordinated strategies could generate synergistic effects, amplifying the overall reach. This redundancy effect exists even for individual companies, making it crucial to target diverse users to avoid diminishing returns.

Each type of content follows an independent propagation process. Due to privacy restrictions, the companies only know the total number of users influenced by the content, without visibility into the underlying diffusion process. Additionally, because the diffusion is inherently random, the observations may vary, necessitating a balance between exploiting successful configurations (exploitation) and experimenting with different users and assignments to discover better strategies (exploration). See Appendix D in the supplementary material for more motivating applications.

For each company, an offline variant of this sequential decision problem (i.e., where the agent has access to an exact value oracle and is only evaluated on a final output) can be modeled as a (un)constrained $k$ -submodular optimization problem (Huber & Kolmogorov, 2012). The class of $k$ -submodular functions generalizes the class of submodular set functions, an important non-linear function class for numerous optimization problems exhibiting a notion of diminishing returns, by not only accounting for the value of a subset of elements (e.g., which users) but also the assignment (e.g., which types of content to which users) as well. Maximizing a $k$ -submodular function is known to be NP-hard even for unconstrained problems (Ward & Živný, 2014). There has been significant progress in developing approximation algorithms for various offline $k$ -submodular maximization problems (Iwata et al., 2015; Ohsaka & Yoshida, 2015; Sakaue, 2017).

The sequential decision problem described above can be modeled as a stochastic combinatorial Multi-Armed Bandit (CMAB) problem with (i) an expected reward that is $k$ -submodular, (ii) an action space formed by the constraints (if any) on elements and their assignments, and (iii) bandit feedback. Multi-Armed Bandits is a classical framework for sequential decision-making under uncertainty. Combinatorial MAB is a specialized branch where each action is a combination of base arms, known as a “super arm”, and the number of super-arms is prohibitively large to try each one.

For CMAB problems with non-linear rewards, the nature of the feedback significantly affects the problem complexity. “Semi-bandit” feedback involves observing additional information (such as the value of individual arms) aside from joint reward of the selected super arm, which greatly simplifies learning. In contrast, “bandit” or “full-bandit” feedback observes only the (joint) reward for actions. Estimating arm values for non-linear rewards with bandit feedback requires deliberate sampling of sub-optimal actions (e.g., such as actions consisting of individual base arms when the function is monotone), which is not typically done in standard MAB methods, since by design they do not take actions identified (with some confidence) to be sub-optimal.

In this paper, we address the problem of stochastic CMAB with $k$ -submodular expected rewards, various constraints, and only bandit feedback.

Our Contributions: We propose and analyze the first CMAB algorithms with sub-linear $\alpha$ -regret for $k$ -submodular rewards using only full-bandit feedback. For each of the following results, we achieve them in part through analyzing the robustness of respective offline algorithms.

•

For non-monotone $k$ -submodular rewards and no constraints, we propose a CMAB algorithm with $\tilde{\mathcal{O}}\left(nk^{\frac{1}{3}}T^{\frac{2}{3}}\right)$ $1/2$ -regret.
•

For monotone $k$ -submodular rewards and no constraints, we propose a CMAB algorithm with $\tilde{\mathcal{O}}\left(nk^{\frac{1}{3}}T^{\frac{2}{3}}\right)$ $k/(2k-1)$ -regret.
•

For monotone $k$ -submodular rewards under individual size constraints, we propose a CMAB algorithm with $\tilde{\mathcal{O}}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}BT^{\frac{2}{3}}\right)$ $1/3$ -regret.
•

For monotone $k$ -submodular rewards under a matroid constraint, we propose a CMAB algorithm with $\tilde{\mathcal{O}}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}MT^{\frac{2}{3}}\right)$ $1/2$ -regret. We specialize our CMAB algorithm for monotone functions with a matroid constraints to the case of total size constraints.
•

For non-monotone $k$ -submodular rewards under a matroid constraint, we propose a CMAB algorithm with $\tilde{\mathcal{O}}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}MT^{\frac{2}{3}}\right)$ $1/3$ -regret.

In the offline setting, it is important to highlight that many of the properties of $f$ no longer hold in the presence of noise. For instance, in the case of a monotone function $f$ , the marginal gains of $\hat{f}$ (noisy version of $f$ ) may no longer be guaranteed to be positive. Similarly, for non-monotone objectives, the property of pairwise monotonicity (see Section 2) for $f$ might not remain valid for $\hat{f}$ . As a result, many proof steps in the original paper(s) that introduced those offline algorithms do not go through directly and, consequently, need novel analysis. To address these challenges, this study incorporates bounded error to establish properties akin to the original ones in their respective contexts.

Analyzing the robustness of the offline algorithms is important, even without transforming them into an online setting. For those algorithms that possess inherent robustness, we refrain from modifications and maintain the original offline algorithm to ensure an efficient offline-to-online transformation. However, substantial modifications may be necessary for the original offline algorithm to maintain robustness in the presence of noise (e.g., Section 4). In such cases, we aim to introduce minimal alterations to the offline algorithm. We note that these adjustments require careful design, which is a part of the novelty of this paper.

Remark 1.1.

The regret guarantees we obtained in this work are all of order $\mathcal{O}(T^{2/3})$ . It is unknown whether $O(\sqrt{T})$ expected cumulative $\alpha$ -regret is possible even in the case $k=1$ . The only known results with bandit feedback for $k=1$ are $O(T^{2/3})$ for submodular bandits (Streeter & Golovin, 2008; Niazadeh et al., 2021). While some lower bounds of $O(T^{2/3})$ exist (e.g., in adversarial setup with cardinality constraint (Niazadeh et al., 2021), or stochastic setup with cardinality constraint where regret is defined as gap to greedy algorithm (Tajdini et al., 2023)), they are for specialized setups even for $k=1$ , and thus a general understanding of lower bounds in these setups is an open problem.

Table 1: Summary of offline

\alpha

-approximation algorithms for

k

-submodular maximization with our

\delta

-robustness analysis and

\alpha

-regret bounds for our proposed algorithms for

k

-submodular CMAB with full-bandit feedback.

N

is an upper bound on the query complexity of the offline algorithm.

B

is the total budget.

M

is the rank of the matriod. There are no prior sublinear

\alpha

-regret bounds for

k

-submodular CMAB with full-bandit feedback. * Sun et al. (2022)’s result for non-monotone

f

with matroid constraints specializes to the first results for non-monotone

f

with total size constraints.

\dagger

Both the

1/2

approximation and the

\delta

we obtain by analyzing the robustness of Ohsaka & Yoshida (2015)’s offline approximation algorithm for monotone

f

with total size constraints are strictly generalized by Sakaue (2017)’s offline approximation algorithm for monotone

f

with matroid constraints.

Ref.	Objective $f$	Constraint	$\alpha$	$\delta$	$N$	Our $\alpha$ -regret
Iwata et al. (2015)	Non-Monotone	Unconstrained	$1/2$	$20n$	$nk$	$\tilde{\mathcal{O}}\left(nk^{\frac{1}{3}}T^{\frac{2}{3}}\right)$
Iwata et al. (2015)	Monotone	Unconstrained	$k/(2k-1)$	$(16-\frac{2}{k})n$	$nk$	$\tilde{\mathcal{O}}\left(nk^{\frac{1}{3}}T^{\frac{2}{3}}\right)$
Ohsaka & Yoshida (2015) $\dagger$	Monotone	Total Size	$1/2$	$B+1$	$nkB$	$\tilde{\mathcal{O}}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}BT^{\frac{2}{3}}\right)$
Sun et al. (2022)*	Non-Monotone	Total Size	$1/3$	$4/3(B+1)$	$nkB$	$\tilde{\mathcal{O}}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}BT^{\frac{2}{3}}\right)$
Ohsaka & Yoshida (2015)	Monotone	Individual Size	$1/3$	$4/3(B+1)$	$nkB$	$\tilde{\mathcal{O}}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}BT^{\frac{2}{3}}\right)$
Sakaue (2017) $\dagger$	Monotone	Matroid	$1/2$	$M+1$	$nkM$	$\tilde{\mathcal{O}}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}MT^{\frac{2}{3}}\right)$
Sun et al. (2022)	Non-Monotone	Matroid	$1/3$	$4/3(M+1)$	$nkM$	$\tilde{\mathcal{O}}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}MT^{\frac{2}{3}}\right)$

Related works: We briefly highlight closely related works. See Appendix C for an expanded discussion.

$k$ -submodular CMAB

To the best of our knowledge, the only prior work for $k$ -submodular CMAB for any constraint type and/or feedback model is presented in (Soma, 2019). They considered unconstrained $k$ -submodular maximization under semi-bandit feedback in adversarial setting. For the non-monotone and monotone cases, they propose algorithms with $1/2$ and $\frac{k}{2k-1}$ regrets upper bounded by $\mathcal{O}(nk\sqrt{T})$ respectively. Their approach is based on Blackwell approachability (Blackwell, 1956). As discussed in the introduction, availability of semi-bandit feedback in CMAB problems with non-linear rewards significantly simplifies the learning problem (i.e., with semi-bandit feedback $\sqrt{T}$ $\alpha$ -regret bound dependence is typically easy to achieve). We also note that the adversarial CMAB setting does not strictly generalize the stochastic setting. In the adversarial setting, the reward functions at each time step $\{f_{t}\}$ must exhibit $k$ -submodularity but otherwise can vary widely. In the stochastic setting, the individual $f_{t}$ ’s are not necessarily $k$ -submodular but the expected reward function, represented as $f=\mathbb{E}[f_{t}]$ , is.

Submodular CMAB

For submodular rewards (i.e., $k=1$ ), Streeter & Golovin (2008) proposed and analyzed an algorithm for adversarial CMAB with submodular rewards, full-bandit feedback, and under a knapsack constraint (though only in expectation, taken over randomness in the algorithm). The authors adapted a simpler greedy algorithm (Khuller et al., 1999), using an $\epsilon$ -greedy exploration type framework. Niazadeh et al. (2021) proposed a framework for transforming iterative greedy $\alpha$ -approximation algorithms for offline problems to online methods in an adversarial bandit setting, for both semi-bandit (achieving $\widetilde{O}(T^{1/2})$ $\alpha$ -regret) and full-bandit feedback (achieving $\widetilde{O}(T^{2/3})$ $\alpha$ -regret). In the stochastic setting, Nie et al. (2022) proposed an explore-then-commit type algorithm for online monotone submodular maximization under cardinality constraint. The result is extended with an optimized stochastic-explore-then-commit approach in Fourati et al. (2024b). Fourati et al. (2023) proposed randomized greedy learning algorithm for online non-monotone submodular maximization. Nie et al. (2023a) recently proposed a general framework for adapting offline to online algorithms under full bandit feedback. Their framework require the adapted offline algorithm to satisfy the so called $(\alpha,\delta)$ -robustness property.

There are also a number of works that require additional “semi-bandit” feedback. For combinatorial MAB with submodular rewards, a common type of semi-bandit feedback are marginal gains (Lin et al., 2015; Yue & Guestrin, 2011; Yu et al., 2016; Takemori et al., 2020), which enable the learner to take actions of maximal cardinality or budget, receive a corresponding reward, and gain information not just on the set but individual elements.

2 BACKGROUND

2.1 $k$ -Submodular Functions:

Let $k$ be a positive integer for the number of types (i.e., types of stories) and $V=[n]$ be the ground set of elements (i.e., users in a social network). Let $(k+1)^{V}:=\{(X_{1},\ldots,X_{k})|X_{i}\subseteq V,i\in\{1,\ldots,k\},X_{i}\cap X_{j}=\emptyset,\forall i\neq j\}$ . A function $f:(k+1)^{V}\rightarrow\mathbb{R}$ is called $k$ -submodular if, for any $\boldsymbol{x}=(X_{1},\ldots,X_{k})$ and $\boldsymbol{y}=(Y_{1},\ldots,Y_{k})$ in $(k+1)^{V}$ , we have

f(\boldsymbol{x})+f(\boldsymbol{y})\geq f(\boldsymbol{x}\sqcup\boldsymbol{y})+f(\boldsymbol{x}\sqcap\boldsymbol{y})

where $\boldsymbol{x}\sqcap\boldsymbol{y}:=(X_{1}\cap Y_{1},\ldots,X_{k}\cap Y_{k}),\boldsymbol{x}\sqcup\boldsymbol{y}:=(X_{1}\cup Y_{1}\setminus(\cup_{i\neq 1}X_{i}\cup Y_{i}),\ldots,X_{k}\cup Y_{k}\setminus(\cup_{i\neq k}X_{i}\cup Y_{i})).$ Note that setting $k=1$ in these definitions recovers submodular functions, set intersection, and set union, respectively. We further define a relation $\preceq$ on $(k+1)^{V}$ so that, for $\boldsymbol{x}=(X_{1},\ldots,X_{k})$ and $\boldsymbol{y}=(Y_{1},\ldots,Y_{k})$ in $(k+1)^{V},\boldsymbol{x}\preceq\boldsymbol{y}$ if $X_{i}\subseteq Y_{i}$ for every $i$ with $i\in[k]$ . We also define the marginal gain of assigning type $i\in[k]$ to element $e$ given a current solution $\boldsymbol{x}$ (provided that $e$ has not been assigned any type in $\boldsymbol{x}$ ),

	$\displaystyle\Delta_{e,i}f(\boldsymbol{x})$	$\displaystyle=f(X_{1},\ldots,X_{i-1},X_{i}\cup\{e\},X_{i+1},\ldots,X_{k})$
		$\displaystyle\qquad-f(X_{1},\ldots,X_{k})$

for $\boldsymbol{x}\in(k+1)^{V},e\notin\bigcup_{\ell\in[k]}X_{\ell}$ , and $i\in[k]$ .

Theorem 2.1.

(Ward & Živný, 2014) A function $f:(k+1)^{V}\rightarrow\mathbb{R}$ is $k$ -submodular if and only if $f$ satisfies the following two conditions:
Orthant submodularity: $\Delta_{e,i}f(\boldsymbol{x})\geq\Delta_{e,i}f(\boldsymbol{y})$ for any $\boldsymbol{x},\boldsymbol{y}\in(k+1)^{V}$ with $\boldsymbol{x}\preceq\boldsymbol{y},e\notin\bigcup_{\ell\in[k]}Y_{\ell}$ , and $i\in[k]$ ;
Pairwise monotonicity: $\Delta_{e,i}f(\boldsymbol{x})+\Delta_{e,j}f(\boldsymbol{x})\geq 0$ for any $\boldsymbol{x}\in(k+1)^{V},e\notin\bigcup_{\ell\in[k]}X_{\ell}$ , and $i,j\in[k]$ with $i\neq j$ .

We define the support of $\boldsymbol{x}\in(k+1)^{V}$ as $\operatorname{supp}(\boldsymbol{x})=\{e\in V\mid\boldsymbol{x}(e)\neq 0\}$ and define the support of type $i\in[k]$ as $\operatorname{supp}_{i}(\boldsymbol{x})=\{e\in V\mid\boldsymbol{x}(e)=i\}$ . Further, a function $f:(k+1)^{V}\rightarrow\mathbb{R}$ is called monotone if $\Delta_{e,i}f(\boldsymbol{x})\geq 0$ for any $\boldsymbol{x}\in(k+1)^{V},e\notin\bigcup_{\ell\in[k]}X_{\ell}$ , and $i\in[k]$ .

Matroids

For a finite set $E$ and $\mathcal{F}\subseteq 2^{E}$ , we say a system $(E,\mathcal{F})$ is a matroid if the following hold:

•

(M1) $\emptyset\in\mathcal{F}$ ,
•

(M2) If $A\subseteq B\in\mathcal{F}$ then $A\in\mathcal{F}$ ,
•

(M3) If $A,B\in\mathcal{F}$ and $|A|<|B|$ then there exists $e\in B\backslash A$ such that $A\cup\{e\}\in\mathcal{F}$ .

The elements of $\mathcal{F}$ are called independent, and we say $A\in\mathcal{F}$ is maximal if no $B\in\mathcal{F}$ satisfies $A\subsetneq B$ . A maximal independent set $B$ is called a basis of the matroid. The rank function of a matroid $\mathcal{M}=(E,\mathcal{F})$ is defined as $r_{\mathcal{M}(A)}=\max\{|S|:S\subseteq A,S\in\mathcal{F}\}$ . Under the matroid constraint with a matroid $\mathcal{M}=(E,\mathcal{F})$ , a solution $\boldsymbol{x}\in(k+1)^{V}$ is feasible if $\boldsymbol{x}\in\mathcal{F}$ .

A simple but important matroid constraint $\mathcal{M}=(E,\mathcal{F})$ is the uniform matroid in which $\mathcal{F}=\{X\in V:|X|\leq B\}$ for a given $B$ . It is equivalent to the Total Size (TS) constraint in the problem we consider.

2.2 CMAB

In the CMAB framework, we consider sequential, combinatorial decision-making problems over a finite time horizon $T$ . Let $\Omega$ denote the ground set of base arms and $n=|\Omega|$ denote the number of arms. Let $D\subseteq 2^{\Omega}$ denote the subset of feasible actions, for which we presume membership can be efficiently evaluated. We will use the terminologies subset and action interchangeably throughout the paper.

At each time step $t$ , the learner selects a feasible action $A_{t}\in D$ . After the subset $A_{t}$ is selected, the learner receives a reward $f_{t}(A_{t})$ . We assume the reward $f_{t}$ is stochastic, bounded in $[0,1]$ , and i.i.d. conditioned on the action $A_{t}$ . Define the expected reward function as $f(A):=\mathbb{E}[f_{t}(A)]$ . The goal of the learner is to maximize the cumulative reward $\sum_{t=1}^{T}f_{t}(A_{t})$ . To measure the performance of the algorithm, one common metric is to compare the learner to an agent with access to a value oracle for $f$ . However, if optimizing $f$ over $D$ is NP-hard, such a comparison would not be meaningful unless the horizon is exponentially large in the problem parameters. Alternatively, if there exists a known approximation algorithm $\mathcal{A}$ with an approximation ratio $\alpha\in(0,1]$ for optimizing $f$ over $D$ , it is more natural to evaluate the performance of a CMAB algorithm against what $\mathcal{A}$ could achieve. In this scenario, we consider the expected cumulative $\alpha$ -regret $\mathcal{R}_{\alpha,T}$ , which quantifies the difference between $\alpha$ times the cumulative reward of the optimal subset’s expected value and the average received reward, (we write $\mathcal{R}_{T}$ when $\alpha$ is understood from context)

\displaystyle\mathbb{E}[\mathcal{R}_{T}]=\alpha Tf(\mathrm{OPT})-\mathbb{E}\left[\sum_{t=1}^{T}f_{t}(A_{t})\right],

(1)

where OPT is the optimal solution, i.e., $\text{OPT}\in\operatorname*{arg\,max}_{A\in D}f(A)$ and the expectations are with respect to both the rewards and actions (if random).

2.3 Problem Statement

We consider the sequential decision making problem under the stochastic CMAB framework where the expected reward function $f$ is $k$ -submodular. Each arm consists of a tuple (we call it an item-type pair): $(e,i)\in V\times[k]$ , and a super arm is defined as $\boldsymbol{x}\in(k+1)^{V}$ , which is a combination of base arms. We consider full-bandit feedback, where after each time an action $A_{t}$ is selected, the learner can only observe $f_{t}(A_{t})$ as reward. We aim to transform various offline algorithms in $k$ -submodular optimization literature to online algorithms, thus we consider $\alpha$ regret as our performance metric and $\alpha$ is defined to be the approximation ratio of the corresponding offline algorithm.

2.4 Offline-to-Online Framework

Algorithm 1 C-ETC (Nie et al., 2023a)

1: Input: horizon

T

, set

\Omega

n

base arms, an offline

(\alpha,\delta)

-robust algorithm

\mathcal{A}

, and an upper-bound

N

on the number of queries

\mathcal{A}

will make to the value oracle.

2: Initialize

m\leftarrow\left\lceil\frac{\delta^{2/3}T^{2/3}\log(T)^{1/3}}{2N^{2/3}}\right\rceil

3: // Exploration Phase //

4: while

\mathcal{A}

queries the value of some

A\subseteq\Omega

5: For

m

times, play action

A

6: Calculate the empirical mean

\bar{f}

7: Return

\bar{f}

\mathcal{A}

8: end while

9: // Exploitation Phase //

10: for remaining time do

11: Play action

S

output by algorithm

\mathcal{A}

12: end for

As we adopt the novel offline-to-online transformation framework proposed in (Nie et al., 2023a), in this section, we briefly introduce the framework. In (Nie et al., 2023a), they introduced a criterion for the robustness of an offline approximation algorithm. They showed that this property alone is sufficient to guarantee that the offline algorithm can be adapted to solve CMAB problems in the corresponding online setting with just bandit feedback and achieve sub-linear regret. More importantly, the CMAB adaptation will not rely on any special structure of the algorithm design, instead employing it as a black box. We restate the robustness definition in the following. This definition of robustness indicates that the algorithm $\mathcal{A}$ will maintain reasonably good performance when function evaluations have errors.

Definition 2.2 ( $(\alpha,\delta,N)$ -Robust Approximation (Nie et al., 2023a)).

An algorithm (possibly random) $\mathcal{A}$ is an $(\alpha,\delta,N)$ -robust approximation algorithm for the combinatorial optimization problem of maximizing a function $f:2^{\Omega}\to\mathbb{R}$ over a finite domain $D\subseteq 2^{\Omega}$ if its output $S^{*}$ using a value oracle for $\hat{f}$ satisfies the relation below with the optimal solution $\mathrm{OPT}$ under $f$ , provided that for any $\epsilon>0$ that $|f(S)-\hat{f}(S)|\leq\epsilon$ for all $S\in D$ ,

\displaystyle\mathbb{E}[f(S^{*})]\geq\alpha f(\mathrm{OPT})-\delta\epsilon,

where $\Omega$ is the ground set, the expectation is over the randomness of the algorithm $\mathcal{A}$ , and algorithm $\mathcal{A}$ uses at most $N$ value oracle queries.

Note that when we have access to the exact oracle ( $\varepsilon=0$ ), the definition will give us the same guarantee as the original offline algorithm; if the offline algorithm is exact, $\alpha=1$ . Equipped with a robustness assurance, the authors introduced a stochastic CMAB algorithm named “Combinatorial Explore-Then-Commit” (C-ETC). See Algorithm 1 for the pseudocode. C-ETC interfaces with an offline $(\alpha,\delta,N)$ -robust algorithm denoted as $\mathcal{A}$ . In the exploration phase, when $\mathcal{A}$ requests information from the value oracle pertaining to action $A$ , C-ETC adopts a strategy of executing action $A$ multiple times, specifically $m$ times, where $m$ is an optimization parameter. Subsequently, C-ETC computes the empirical mean, represented as $\bar{f}$ , of the rewards associated with action $A$ and communicates this computed value back to the offline algorithm $\mathcal{A}$ . In the exploitation phase, C-ETC continuously deploys the solution $S$ generated by the $\mathcal{A}$ algorithm. They showed the following theorem:

Theorem 2.3.

(Nie et al., 2023a) ¹¹1Nie et al. (2023a)’s results were for deterministic offline approximation algorithms. See https://arxiv.org/pdf/2301.13326.pdf for an extension to randomized offline approximation algorithms. The expected cumulative $\alpha$ -regret of C-ETC using an $(\alpha,\delta,N)$ -robust approximation algorithm $\mathcal{A}$ as a subroutine is at most $\mathcal{O}\left(\delta^{\frac{2}{3}}N^{\frac{1}{3}}T^{\frac{2}{3}}\log(T)^{\frac{1}{3}}\right)$ with $T\geq\max\left\{N,\frac{2\sqrt{2}N}{\delta}\right\}$ .

This approach allows for a CMAB procedure that does not require any specialized structural characteristics from $\mathcal{A}$ . It necessitates no intricate construction beyond the execution of $\mathcal{A}$ , provided the criteria of robustness (Definition 2.2) are met.

Remark 2.4.

We note that this result can be extended to multiple agents, where the exploration ( $m$ times play of action $A$ ) happens in a distributed manner over the agents as shown in Fourati et al. (2024a).

In the following sections, we transform various offline algorithms for $k$ -submodular maximization problems under different constraints to online bandit setting by analyzing the robustness of the offline algorithms. In analyzing the robustness of offline algorithms, we assume the offline algorithm is evaluated with a surrogate function $\hat{f}$ with $|f(S)-\hat{f}(S)|\leq\epsilon$ for all $S\in D$ , instead of the exact function $f$ . We will highlight some parts of the proof for non-monotone unconstrained problem, and defer all missing proofs to the appendix.

Remark 2.5 (Offline v.s. Online Algorithms).

We note the following major differences between offline algorithms and online algorithms. First, an offline algorithm assumes exact oracle access to the objective function $f$ . However, in certain scenarios, the objective might not be predetermined. Nonetheless, if we engage in recurrent tasks while encountering objectives $f_{1},f_{2},\cdots,f_{T}$ , potentially sampled from a distribution, there is a possibility of acquiring the ability to perform satisfactorily on average over time. Such algorithms are referred to as online algorithms. Second, offline algorithms are designed to optimize an objective $f$ in the sense of finding a “final” solution, but the quality of intermediate solutions does not matter. Those offline algorithms care about computational and sample complexity that depends on problem size, but there is no sense of a horizon explicitly. In contrast, the aim of online algorithms is not to output a single solution (there is “simple regret” but we consider more common cumulative regret) but the sum of achieved values, so intermediate solutions matter and the horizon plays an explicit role.

3 Non-monotone Functions without Constraints

In this section, we consider the case where the expected objective function is non-monotone $k$ -submodular and there is no constraint in selecting actions. We adopt the offline algorithm proposed in Iwata et al. (2015).

3.1 Algorithm

We first remark the following key fact for (monotone or non-monotone) $k$ -submodular maximization without constraints (see Iwata et al. (2013) for the proof):

Proposition 3.1.

For any $k$ -submodular function $f:(k+1)^{V}\rightarrow\mathbb{R}_{+}$ with $k\geq 2$ , there exists a partition of $V$ that attains the maximum value of $f$ .

This says that there is an optimal solution with all elements included (a combinatorial search is needed to determine their types). Proposition 3.1 remains valid for non-monotone functions due to the special property of $k$ -submodular functions (pairwise monotonicity).

Armed with this property, they proposed a randomized offline algorithm for maximizing a non-monotone $k$ -submodular maximization without constraint. The algorithm is presented in Algorithm 2 in the appendix. This algorithm iterates through all $e\in V$ (in an arbitrary but fixed order) and selects the type $i\in[k]$ randomly according to some carefully designed probabilities $p_{i}$ for each $e\in V$ . The authors showed that this algorithm achieves a $1/2$ -approximation ratio for non-monotone $k$ -submodular maximization without constraints.

3.2 Robustness

We show that Algorithm 2 is $(\frac{1}{2},\delta,N)$ robust for a certain $\delta$ and $N$ . Due to that robustness we can use the framework proposed in Nie et al. (2023a), to transform Algorithm 2 into an online algorithm under bandit feedback achieving sublinear $1/2$ -regret.

Let $\boldsymbol{o}$ be an optimal solution with $\operatorname{supp}(\boldsymbol{o})=V$ (by Proposition 3.1 such a solution exists). Let $\boldsymbol{s}$ be the output of the algorithm with $|\operatorname{supp}(\boldsymbol{s})|=n$ using surrogate function $\hat{f}$ . We consider the $j$ -th iteration of the algorithm, and let $e^{(j)}$ be the element of $V$ considered in the $j$ -th iteration, $p_{i}^{(j)}$ be the probability that $i$ -th type is chosen in the $j$ -th iteration, and $\boldsymbol{s}^{(j)}$ be the solution after the $i$ -th iteration, where $\boldsymbol{s}^{(0)}=\mathbf{0}$ . Also for $0\leq j\leq n$ , let $\boldsymbol{o}^{(j)}$ be the element in $(k+1)^{V}$ obtained from $\boldsymbol{o}$ by replacing the coordinates on $\operatorname{supp}(\boldsymbol{s}^{(j)})$ with those of $\boldsymbol{s}^{(j)}$ , and for $1\leq j\leq n$ let $\boldsymbol{t}^{(j-1)}$ be the element in $(k+1)^{V}$ obtained from $\boldsymbol{o}^{(j)}$ by changing $\boldsymbol{o}^{(j)}(e^{(j)})$ with 0.

For $i\in[k]$ , let $y_{i}^{(j)}=\Delta_{e^{(j)},i}f(\boldsymbol{s}^{(j-1)})$ , $\hat{y}_{i}^{(j)}=\Delta_{e^{(j)},i}\hat{f}(\boldsymbol{s}^{(j-1)})$ and let $a_{i}^{(j)}=\Delta_{e^{(j)},i}f(\boldsymbol{t}^{(j-1)})$ , $\hat{a}_{i}^{(j)}=\Delta_{e^{(j)},i}\hat{f}(\boldsymbol{t}^{(j-1)})$ . Due to pairwise monotonicity, we have $y_{i}^{(j)}+y_{i^{\prime}}^{(j)}\geq 0$ and $a_{i}^{(j)}+a_{i^{\prime}}^{(j)}\geq 0$ for all $i,i^{\prime}\in[k]$ with $i\neq i^{\prime}$ , and thus, while the surrogate function $\hat{f}$ does not necessarily have pairwise monotonicity, we have

\hat{y}_{i}^{(j)}+\hat{y}_{i^{\prime}}^{(j)}\geq-4\varepsilon,\hat{a}_{i}^{(j)}+\hat{a}_{i^{\prime}}^{(j)}\geq-4\varepsilon.

(2)

for all $i,i^{\prime}\in[k]$ with $i\neq i^{\prime}$ since both inequalities involve a sum of four function values of $\hat{f}$ which can be off by at most $\epsilon$ with respect to $f$ . Also from $\boldsymbol{s}^{(j)}\preceq\boldsymbol{t}^{(j)}$ which holds by construction, orthant submodularity implies $y_{i}^{(j)}\geq a_{i}^{(j)}$ for all $i\in[k]$ and thus,

\displaystyle\hat{y}_{i}^{(j)}\geq\hat{a}_{i}^{(j)}-4\varepsilon.

(3)

for all $i\in[k]$ . W.l.o.g., we assume $f(\boldsymbol{0})=0$ .

Before showing the main result, we first show the following lemma, which is a generalization of Lemma 2.2 in Iwata et al. (2015) in the presence of noise.

Lemma 3.2.

Let $c,d\in\mathbb{R}_{+}$ . Conditioning on $\boldsymbol{s}^{(j-1)}$ , suppose that

\displaystyle\sum_{i=1}^{k}\left(\hat{a}_{i^{*}}^{(j)}-\hat{a}_{i}^{(j)}\right)p_{i}^{(j)}\leq c\left(\sum_{i=1}^{k}\hat{y}_{i}^{(j)}p_{i}^{(j)}\right)+d\varepsilon

(4)

holds for each $j$ with $1\leq j\leq n$ , where $i^{*}=\boldsymbol{o}(e^{(j)})$ . Then $\mathbb{E}[f(\boldsymbol{s})]\geq\frac{1}{1+c}f(\boldsymbol{o})-(2c+d+4)n\varepsilon$ .

We defer the proof of this lemma to Section A.1. Then, we show the following proposition:

Proposition 3.3.

Algorithm 2 for maximizing a non-monotone $k$ -submodular function is a $(\frac{1}{2},20n,nk)$ -robust approximation algorithm.

To show Proposition 3.3, by Definition 2.2, we aim to show that the output $\boldsymbol{s}$ of Algorithm 2 satisfies $\mathbb{E}[f(\boldsymbol{s})]\geq\frac{1}{2}f(\boldsymbol{o})-20n\varepsilon$ when the function evaluation is based on surrogate function $\hat{f}$ . By Lemma 3.2 it suffices to prove (4) for every $j\in[n]$ for $c=1$ and $d=14$ . For simplicity of the description, we shall omit the superscript $(j)$ if it is clear from the context. Due to limited space, here we only consider the case when $i^{+}\geq 3$ since this case needs more effort technically. The proof for other cases is in supplementary material Section A.2.

Part of proof of Proposition 3.3.

( $i^{+}\geq 3$ ) Our goal is to show

\displaystyle\sum_{i=1}^{k}(\hat{y}_{i}+\hat{a}_{i})p_{i}\geq\hat{a}_{i^{*}}-14\varepsilon,

(5)

which is equivalent to (4) with $c=1$ and $d=14$ . By the ordering of $\{\hat{y}_{i}\}_{i\in[k]}$ and (3), for $i\leq i^{*}$ we have

\displaystyle\hat{y}_{i}\geq\hat{y}_{i^{*}}\geq\hat{a}_{i^{*}}-4\varepsilon.

(6)

Denote $r\in\operatorname*{arg\,min}_{i\in[k]}\hat{a}_{i}$ .

Case 1: If $r=i^{*}$ , we have $\sum_{i}\hat{a}_{i}p_{i}\geq\hat{a}_{i^{*}}(\sum_{i}p_{i})=\hat{a}_{i^{*}}$ . Since $\sum_{i}\hat{y}_{i}p_{i}\geq 0$ and since from Line 11 in Algorithm 2, a positive probability $p_{i}>0$ is only assigned to positive types $i$ with positive $\hat{y}_{i}>0$ values,

\displaystyle\sum_{i=1}^{k}(\hat{y}_{i}+\hat{a}_{i})p_{i}\geq 0+\hat{a}_{i^{*}}\geq\hat{a}_{i^{*}}-14\varepsilon.

Thus (5) follows.

Case 2: If $r\neq i^{*}$ and $i^{*}\geq i^{+}$ , since from Line 11 in Algorithm 2, a positive probability $p_{i}>0$ is only assigned to positive $\hat{y}$ values, we have that $\sum_{i}\hat{y}_{i}p_{i}=\sum_{i\leq i^{+}}\hat{y}_{i}p_{i}\geq\sum_{i\leq i^{+}}(\hat{a}_{i^{*}}-4\varepsilon)p_{i}=\hat{a}_{i^{*}}-4\varepsilon$ by (6). When $\hat{a}_{r}\geq 0$ , since $\hat{a}_{i}\geq\hat{a}_{r}$ for any $i\neq r$ by definition of $r$ , we have $\sum_{i}\hat{a}_{i}p_{i}\geq 0$ . When $\hat{a}_{r}<0$ ,

$\displaystyle\sum_{i}\hat{a}_{i}p_{i}$	$\displaystyle\geq\sum_{i\neq r}(-\hat{a}_{r}-4\varepsilon)p_{i}+\hat{a}_{r}p_{r}$	(by (2))
	$\displaystyle=\hat{a}_{r}\left(p_{r}-\sum_{i\neq r}p_{i}\right)-4\varepsilon\sum_{i\neq r}p_{i}$
	$\displaystyle\geq 0-4\varepsilon,$

where the last inequality follows from $\hat{a}_{r}<0$ , $\sum_{i\neq r}p_{i}\geq p_{r}$ and $\sum_{i\neq r}p_{i}\leq 1$ , and $\sum_{i\neq r}p_{i}\geq p_{r}$ follows from Line 11 in Algorithm 2 that the largest single probability assigned is $\frac{1}{2}$ .

Thus,

\displaystyle\sum_{i=1}^{k}(\hat{y}_{i}+\hat{a}_{i})p_{i}

\displaystyle\geq(\hat{a}_{i^{*}}-4\varepsilon)+(-4\varepsilon)\geq\hat{a}_{i^{*}}-14\varepsilon.

Therefore (5) holds.

Case 3: If $r\neq i^{*}$ and $i^{*}<i^{+}$ , we have

	$\displaystyle\sum_{i=1}^{k}(\hat{y}_{i}+\hat{a}_{i})p_{i}=\sum_{i=1}^{k}\hat{y}_{i}p_{i}+\sum_{i=1}^{k}\hat{a}_{i}p_{i}$
	$\displaystyle\geq\sum_{i\leq i^{}}(\hat{a}_{i^{}}-4\varepsilon)p_{i}+\sum_{i>i^{*}}(\hat{a}_{i}-4\varepsilon)p_{i}+\sum_{i=1}^{k}\hat{a}_{i}p_{i}$		(using (6) and (3))
	$\displaystyle=\sum_{i\leq i^{}}\hat{a}_{i^{}}p_{i}+\sum_{i>i^{*}}\hat{a}_{i}p_{i}+\sum_{i=1}^{k}\hat{a}_{i}p_{i}-4\varepsilon$
	$\displaystyle=\left(\sum_{i<i^{}}\hat{a}_{i^{}}p_{i}+2\hat{a}_{i^{}}p_{i^{}}\right)$
	$\displaystyle\qquad+\left(\sum_{i>i^{}}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{}}\hat{a}_{i}p_{i}+\hat{a}_{r}p_{r}\right)-4\varepsilon.$		(7)

The first term equals $\hat{a}_{i^{*}}$ by simple calculations. Hence it suffices to show that the second term of (7) is greater than or equal to $-10\varepsilon$ .

Sub-case 3.a: First, we consider $\hat{a}_{r}\geq 0$ . By definition of $r$ , for all $i\in[k]$ , $\hat{a}_{i}\geq\hat{a}_{r}\geq 0$ . The second term of (7) can then be bounded

	$\displaystyle\sum_{i>i^{}}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{}}\hat{a}_{i}p_{i}+\hat{a}_{r}p_{r}$
	$\displaystyle\qquad\geq\hat{a}_{r}\left(p_{r}+\sum_{i>i^{}}p_{i}+\sum_{i\neq r,i^{}}p_{i}\right)\geq 0\geq-10\varepsilon.$		(8)

Sub-case 3.b: Second we consider $\hat{a}_{r}<0$ . Since $i^{*}<i^{+}$ , we have

	$\displaystyle\sum_{i>i^{}}p_{i}+\sum_{i\neq r,i^{}}p_{i}=\sum_{i^{}+1\leq i\leq i^{+}-1}p_{i}+p_{i^{+}}+\sum_{i\neq r,i^{}}p_{i}$
	$\displaystyle=\sum_{i=i^{}+1}^{i^{+}-1}\left(\frac{1}{2}\right)^{i}+\left(\frac{1}{2}\right)^{i^{+}-1}+\sum_{i\neq r,i^{}}p_{i}$		(by construction of $p_{i}$ ’s)
	$\displaystyle=\left(\frac{1}{2}\right)^{i^{}}+\sum_{i\neq r,i^{}}p_{i}$		(geometric sum)
	$\displaystyle=p_{i^{}}+\sum_{i\neq r,i^{}}p_{i}=1-p_{r}.$		(9)

Therefore, if $r<i^{*}$ , we get

	$\displaystyle\sum_{i>i^{}}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{}}\hat{a}_{i}p_{i}+\hat{a}_{r}p_{r}$
	$\displaystyle\geq\sum_{i>i^{}}(-\hat{a}_{r}-4\varepsilon)p_{i}+\sum_{i\neq r,i^{}}(-\hat{a}_{r}-4\varepsilon)p_{i}+\hat{a}_{r}p_{r}$		(using (2) )
	$\displaystyle=\hat{a}_{r}\left(p_{r}-\sum_{i>i^{}}p_{i}-\sum_{i\neq r,i^{}}p_{i}\right)-4\varepsilon\left(\sum_{i>i^{}}p_{i}+\sum_{i\neq r,i^{}}p_{i}\right)$
	$\displaystyle=\hat{a}_{r}\left(p_{r}-\left(1-p_{r}\right)\right)-4\varepsilon(1-p_{r})$		(using (9))
	$\displaystyle=\hat{a}_{r}\left(2p_{r}-1\right)-4\varepsilon(1-p_{r})$
	$\displaystyle\geq 0-4\varepsilon\geq-10\varepsilon.$		( $\hat{a}_{r}<0$ and $p_{r}\leq 1/2$ )

If $r>i^{*}$ . Then $p_{r}\leq 1/4$ by Line 11 in Algorithm 2 since $r\neq 1$ and $i^{+}\geq 3$ . Hence, by $\hat{a}_{r}<0$ , the second term of (7) can then be bounded as

	$\displaystyle\sum_{i>i^{}}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{}}\hat{a}_{i}p_{i}+\hat{a}_{r}p_{r}$
	$\displaystyle=\sum_{i>i^{},i\neq r}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{}}\hat{a}_{i}p_{i}+2\hat{a}_{r}p_{r}$
	$\displaystyle\geq\sum_{i>i^{},i\neq r}(-\hat{a}_{r}-4\varepsilon)p_{i}+\sum_{i\neq r,i^{}}(-\hat{a}_{r}-4\varepsilon)p_{i}+2\hat{a}_{r}p_{r}$		(using (2))
	$\displaystyle=\hat{a}_{r}\left(2p_{r}-\sum_{i>i^{},i\neq r}p_{i}-\sum_{i\neq r,i^{}}p_{i}\right)-4\varepsilon\left(\sum_{i>i^{},i\neq r}p_{i}+\sum_{i\neq r,i^{}}p_{i}\right)$
	$\displaystyle=\hat{a}_{r}(2p_{r}-(1-2p_{r}))-4\varepsilon(1-2p_{r})$		(using (9))
	$\displaystyle\geq 0-4\varepsilon\geq-10\varepsilon.$		( $\hat{a}_{r}<0$ and $p_{r}\leq 1/4$ )

Thus we conclude that when $\hat{a}_{r}<0$ , we have

\displaystyle\sum_{i>i^{*}}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{*}}\hat{a}_{i}p_{i}+\hat{a}_{r}p_{r}\geq-10\varepsilon.

(10)

Combining (10) and (8), we conclude the proof for $i^{+}\geq 3$ . ∎

Remark 3.4.

The assumption of bounded noise plays an essential role in the analysis. Although in the presence of noise, the original property (e.g., pairwise monotonicity) of the function does not hold, with bounded noise bridging the $\hat{f}$ and $f$ , we can still have a relaxed version of the desired property, leading to similar approximation ratio with only small error. In many cases, showing the relaxed version require completely new steps.

3.3 Regret Bound

Once we have analysed the robustness of Algorithm 2 and identified the number of function evaluations of Algorithm 2 is exactly $nk$ , we can employ the framework proposed in Nie et al. (2023a) and obtain the following result:

Corollary 3.5.

For an online non-monotone unconstrained $k$ -submodular maximization problem, the expected cumulative $1/2$ -regret of C-ETC using Algorithm 2 as a sub-routine is at most $\mathcal{O}\left(nk^{\frac{1}{3}}T^{\frac{2}{3}}\log(T)^{\frac{1}{3}}\right)$ given $T\geq nk$ .

4 Monotone Functions without Constraints

In this section, we continue to explore the unconstrained case, but now the objective is monotone. We adopt the offline $\frac{k}{2k-1}$ -approximation algorithm proposed in (Iwata et al., 2015). The pseudo-code is presented in Algorithm 3 in the appendix. Similar to Algorithm 2 for the non-monotone case, the algorithm iterates through all $e\in V$ (in an arbitrary but fixed order) and selects the type $i\in[k]$ randomly according to some carefully designed probabilities for each $e\in V$ .

One important note to highlight is the modification made on line 6 of the algorithm. In the original algorithm from (Iwata et al., 2015), it was stated as $\beta\leftarrow\sum_{i=1}^{k}y_{i}^{t}$ . This particular step is not robust to noise since later in line 9, the probability of $y_{i}^{t}/\beta$ is assigned to type $i$ . The probability becomes meaningless if $\beta<0$ and this is possible due to noise. We identified that a sufficient modification is to change this step to $\beta\leftarrow\sum_{i=1}^{k}[y_{i}]_{+}^{t}$ . Here, $[x]_{+}=x$ if $x\geq 0$ , and 0 otherwise. Note that when there is no noise, Algorithm 3 reduces to the original algorithm in Iwata et al. (2015) since in the monotone case, $y_{i}\geq 0$ always holds.

We show the following robustness guarantee of Algorithm 3:

Proposition 4.1.

Algorithm 3 for maximizing a monotone $k$ -submodular function is a $(\frac{k}{2k-1},(16-\frac{2}{k})n,nk)$ -robust approximation.

By Definition 2.2, we aim to show that the output $\boldsymbol{s}$ of Algorithm 3 satisfies $\mathbb{E}[f(\boldsymbol{s})]\geq\frac{k}{2k-1}f(\boldsymbol{o})-(16-\frac{2}{k})n\varepsilon$ when the function evaluation is based on surrogate function $\hat{f}$ . Note that in this setting, Lemma 3.2 still holds. Thus, it is sufficient to show

\displaystyle\sum_{i=1}^{k}(\hat{a}_{i^{*}}-\hat{a}_{i})p_{i}\leq(1-\frac{1}{k})\sum_{i=1}^{k}\hat{y}_{i}p_{i}+10\varepsilon,

(11)

which is equivalent to (4) with $c=1-\frac{1}{k}$ and $d=10$ . The detailed proof is in Section A.3.

Remark 4.2.

This modification of $\beta$ becomes significant when dealing with a noisy function oracle $\hat{f}$ , where it’s possible that $\hat{y}_{i}<0$ . Our analysis demonstrates that this modification is crucial in handling the case of $\beta=0$ , which is a trivial case in the noiseless analysis of Iwata et al. (2015). In the presence of noise, the case of $\beta=0$ becomes non trivial. It also plays a pivotal role in another case ( $\beta>0$ ), where we utilized $\beta-[\hat{y}_{i^{*}}]_{+}^{t}\leq\beta$ . In the original algorithm, $\beta-\hat{y}_{i^{*}}^{t}\leq\beta$ does not necessarily hold due to noise.

With the robustness guarantee for Algorithm 3, we can transfer it to the online bandit setting using the framework proposed in Nie et al. (2023a):

Corollary 4.3.

For an online monotone unconstrained $k$ -submodular maximization problem, the expected cumulative $\frac{k}{2k-1}$ -regret of C-ETC is at most $\mathcal{O}\left(nk^{\frac{1}{3}}T^{\frac{2}{3}}\log(T)^{\frac{1}{3}}\right)$ given $T\geq\max\{nk,\frac{2\sqrt{2}k}{16-\frac{2}{k}}\}$ .

5 Monotone Functions with Individual Size Constraints

In this section, we consider Individual Size (IS) constraint. In IS, each type $i$ has a limit $B_{i}$ on the maximum number of pairs of that type $i$ , with $B=\sum_{i}B_{i}$ as the total budget. We consider the offline greedy algorithm proposed in Ohsaka & Yoshida (2015).

The pseudo-code is presented in Algorithm 4 in the appendix. The algorithm builds the solution greedily by iteratively including the element-type pair that yields the largest marginal gain, as long as that pair is still available and the individual budget is not exhausted. We show the following robustness guarantee of Algorithm 4:

Proposition 5.1.

Algorithm 4 for maximizing a monotone $k$ -submodular function under individual size constraints is a $(\frac{1}{3},\frac{4}{3}(B+1),nkB)$ -robust approximation, where $B$ is the total size limit.

The proof is in Section A.4. With the robustness guarantee for Algorithm 4, we can transfer it to the online bandit setting using the framework proposed in Nie et al. (2023a):

Corollary 5.2.

For an online monotone $k$ -submodular maximization problem under individual size constraints, the expected cumulative $1/3$ -regret of C-ETC is at most $\mathcal{O}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}BT^{\frac{2}{3}}\log(T)^{\frac{1}{3}}\right)$ given $T\geq nk\max\{1,\frac{3\sqrt{2}B}{2(B+1)}\}$ .

6 Monotone Functions with Matroid Constraints

In this section, we consider matroid constraint with monotone objective functions. We adapt the offline algorithm proposed in Sakaue (2017), which is depicted in Algorithm 5 in the appendix. The algorithm still builds the solution in a greedy manner. At each iteration, the algorithm constructs available elements $E(\boldsymbol{s})$ given the current solution $\boldsymbol{s}$ using an assumed independence oracle, and includes the element-type pair that yields the largest marginal gain. We show the following robustness guarantee of Algorithm 5:

Proposition 6.1.

Algorithm 5 for maximizing a monotone $k$ -submodular function under a matroid constraint is a $(\frac{1}{2},M+1,nkM)$ -robust approximation, where $M$ is the rank of the matroid.

The proof is in Section A.5. With the robustness guarantee for Algorithm 5, we transform it into a CMAB algorithm using the framework proposed in (Nie et al., 2023a):

Corollary 6.2.

For an online monotone $k$ -submodular maximization problem under a matroid constraint, the expected cumulative $1/2$ -regret of C-ETC is at most $\mathcal{O}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}MT^{\frac{2}{3}}\log(T)^{\frac{1}{3}}\right)$ given $T\geq nk\max\{1,\frac{3\sqrt{2}M}{2(M+1)}\}$ .

One special case of matroid constraint is Total Size (TS) constraint (Ohsaka & Yoshida, 2015). In TS, there is a limited budget on the total number of element-type pairs that can be selected. This is also called uniform matroid in literature. In this case, the rank of the matroid is the total size budget $B$ . We can immediately get an online algorithm as a result of Corollary 6.2:

Corollary 6.3.

For an online monotone $k$ -submodular maximization problem under a TS constraint, the expected cumulative $1/2$ -regret of C-ETC is at most $\mathcal{O}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}BT^{\frac{2}{3}}\log(T)^{\frac{1}{3}}\right)$ given $T\geq nk\max\{1,\frac{3\sqrt{2}B}{2(B+1)}\}$ , using $N_{\mathcal{A}}=nkB$ as an upper bound of the number of function evaluations for Algorithm 5.

Remark 6.4.

In Sections 5 and 6, although marginal gains are not guaranteed to be non-negative, we prove that there is no need to modify the algorithms as in Section 4. This is due to the inherent design of the algorithms. In Algorithm 3, probabilities are assigned to each item in proportion to their marginal gains for each type (Line 9), so it is necessary to ensure that the sum of the marginal gains is non-negative. However, Algorithms 4 and 5 select item-type pairs by directly comparing marginal gains, regardless of whether they are positive or negative. In cases where monotonicity does not hold due to noise, we use bounded error analysis to derive analogous properties.

7 Non-Monotone Functions with Matroid Constraints

In Sun et al. (2022), it is shown that Algorithm 5 in appendix can achieve $1/3$ approximation ratio even the objective function is non-monotone. We show the following robustness result when Algorithm 5 is fed with surrogate function $\hat{f}$ .

Proposition 7.1.

Algorithm 5 for maximizing a non-monotone $k$ -submodular function under a matroid constraint is a $(\frac{1}{3},\frac{4}{3}(M+1),nkM)$ -robust approximation, where $M$ is the rank of the matroid.

The proof is in Section A.6. We use the offline-to-online transformation framework proposed in Nie et al. (2023a) to adapt Algorithm 5:

Corollary 7.2.

For an online non-monotone $k$ -submodular maximization problem under a matroid constraint, the expected cumulative $1/3$ -regret of C-ETC is at most $\mathcal{O}\left(n^{\frac{1}{3}}k^{\frac{1}{3}}MT^{\frac{2}{3}}\log(T)^{\frac{1}{3}}\right)$ given $T\geq nk\max\{1,\frac{3\sqrt{2}M}{2(M+1)}\}$ .

8 Evaluations

In this section, we empirically evaluate the performance of our proposed methods in the context of online influence maximization with $k$ topics. This problem is formulated as a monotone $k$ -submodular bandit problem (detailed below). As there are no directly comparable baselines in the literature, we select naive UCB (NaiveUCB) and random selection (Random) as benchmarks, with UCB implemented using the standard UCB1 algorithm.

Online Influence Maximization with $k$ Topics: The problem involves a social network represented as a directed graph $G=(V,E)$ , where $V$ is the set of nodes and $E$ is the set of edges. Each edge $(u,v)\in E$ has associated weights $p_{u,v}^{i}$ , $i\in[k]$ , where $p_{u,v}^{i}$ denotes the influence strength from user $u$ to $v$ on topic $i$ . The goal is to maximize the number of users influenced by one or more topics. We adopt the $k$ -topic independent cascade ( $k$ -IC) model from Ohsaka & Yoshida (2015), which generalizes the independent cascade model (Kempe et al., 2003).

Specifically, the influence spread $\sigma:(k+1)^{V}\rightarrow\mathbb{R}_{+}$ in the $k$ -IC model is defined as $\sigma(S)=\mathbb{E}\left[\left|\bigcup_{i\in[k]}A_{i}\left(U_{i}(S)\right)\right|\right]$ , where $A_{i}\left(U_{i}(S)\right)$ is a random variable representing the set of influenced nodes in the diffusion process of topic $i$ . As shown in Ohsaka & Yoshida (2015), $\sigma$ is monotone $k$ -submodular. Given a directed graph $G=(V,E)$ , edge probabilities $\{p_{u,v}^{i}\mid(u,v)\in E,i\in[k]\}$ , the task is to select a seed set $S\in(k+1)^{V}$ that maximizes $\sigma(S)$ , subject to some constraints (e.g., total size constraints).

Experiment settings: We use the ego-facebook network (Leskovec & Mcauley, 2012). After applying a community detection algorithm (Blondel et al., 2008) to extract a subgraph, we convert it into a directed graph consisting of 350 users and 2,845 edges. Three sets of edge weights, corresponding to three topics ( $k=3$ ), are generated: Uniform(0, 0.2), Normal(0.1, 0.05), and Exponential(10), with the Normal and Exponential distributions truncated to [0, 1]. We evaluate our algorithms under both TS and IS constraints. The TS budget is set to $B=6$ , while for IS constraints, the budgets for the three topics are all 2. As greedy algorithms offer $1/2$ and $1/3$ approximations for TS and IS respectively (Ohsaka & Yoshida, 2015), we use an offline greedy algorithm to compute the $\alpha$ -optimal value, with 100 diffusion simulations approximating the true function value. Since all cases are monotone, NaiveUCB and Random restrict their search space to actions that exhaust the budget. Means and standard deviations are calculated over 10 independent runs.

Results: The instantaneous rewards over $10^{4}$ time steps is shown in Figure 1. Under both TS and IS constraints, the results can be summarized as follows. In the early stages (before 4,000), ETCG explores suboptimal actions that do not use the full budget, leading to worse initial instantaneous rewards. However, ETCG catches up in later stages and achieves lower cumulative regret over the entire time horizon. We note that NaiveUCB does UCB on all actions that exhaust budgets, which takes approximately ${60\choose 6}\sim 5\times 10^{8}$ for TS and ${20\choose 2}^{3}\sim 7\times 10^{7}$ for IS, respectively, to even explore each action once, resulting a bad performance.

Refer to caption — Figure 1: Instantaneous Rewards on Influence Maximization experiments.

9 Conclusion

In this paper, we investigate the problem of online $k$ -submodular maximization problem under bandit feedback. Utilizing the framework proposed by Nie et al. (2023a), we propose CMAB algorithms through adapting various offline algorithms and analyzing their robustness. We obtain sublinear regret for online $k$ -submodular maximization under bandit feedback under various settings, including monotone and non-monotone objectives without constraint, monotone objectives with individual size constraint, monotone and non-monotone objectives with matroid constraint. Numerical experiments on online influence maximization with $k$ topics were conducted to evaluate the effectiveness of our methods.

References

Blackwell (1956) Blackwell, D. An analog of the minimax theorem for vector payoffs. Pacific Journal of Mathematics, 6:1–8, 1956.
Blondel et al. (2008) Blondel, V. D., Guillaume, J.-L., Lambiotte, R., and Lefebvre, E. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008:10008, 2008.
Chen et al. (2022) Chen, J., Tang, Z., and Wang, C. Monotone k-submodular knapsack maximization: An analysis of the greedy+singleton algorithm. In Ni, Q. and Wu, W. (eds.), Algorithmic Aspects in Information and Management, pp. 144–155, Cham, 2022. Springer International Publishing. ISBN 978-3-031-16081-3.
Ene & Nguyen (2022) Ene, A. and Nguyen, H. Streaming algorithm for monotone k-submodular maximization with cardinality constraints. In International Conference on Machine Learning, pp. 5944–5967. PMLR, 2022.
Fourati et al. (2023) Fourati, F., Aggarwal, V., Quinn, C., and Alouini, M.-S. Randomized greedy learning for non-monotone stochastic submodular maximization under full-bandit feedback. In International Conference on Artificial Intelligence and Statistics, pp. 7455–7471. PMLR, 2023.
Fourati et al. (2024a) Fourati, F., Alouini, M.-S., and Aggarwal, V. Federated combinatorial multi-agent multi-armed bandits. In Forty-first International Conference on Machine Learning, 2024a.
Fourati et al. (2024b) Fourati, F., Quinn, C. J., Alouini, M.-S., and Aggarwal, V. Combinatorial stochastic-greedy bandit. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pp. 12052–12060, 2024b.
Huber & Kolmogorov (2012) Huber, A. and Kolmogorov, V. Towards minimizing k-submodular functions. In International Symposium on Combinatorial Optimization, 2012.
Iwata et al. (2013) Iwata, S., ichi Tanigawa, S., and Yoshida, Y. Bisubmodular function maximization and extensions. Technical Report METR 2013–16, the University of Tokyo, September 2013.
Iwata et al. (2015) Iwata, S., Tanigawa, S.-i., and Yoshida, Y. Improved approximation algorithms for k-submodular function maximization. In ACM-SIAM Symposium on Discrete Algorithms, 2015.
Kempe et al. (2003) Kempe, D., Kleinberg, J., and Tardos, É. Maximizing the spread of influence through a social network. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 137–146, 2003.
Khuller et al. (1999) Khuller, S., Moss, A., and Naor, J. S. The budgeted maximum coverage problem. Information Processing Letters, 70(1):39–45, 1999.
Leskovec & Mcauley (2012) Leskovec, J. and Mcauley, J. Learning to discover social circles in ego networks. volume 25, 2012.
Lin et al. (2015) Lin, T., Li, J., and Chen, W. Stochastic online greedy learning with semi-bandit feedbacks. In Proceedings of the 29th International Conference on Neural Information Processing Systems, pp. 352–360, 2015.
Matsuoka & Ohsaka (2021) Matsuoka, T. and Ohsaka, N. Maximization of monotone $k$ -submodular functions with bounded curvature and non- $k$ -submodular functions. In Asian Conference on Machine Learning, pp. 1707–1722. PMLR, 2021.
Niazadeh et al. (2021) Niazadeh, R., Golrezaei, N., Wang, J. R., Susan, F., and Badanidiyuru, A. Online learning via offline greedy algorithms: Applications in market design and optimization. In Proceedings of the 22nd ACM Conference on Economics and Computation, pp. 737–738, 2021.
Nie et al. (2022) Nie, G., Agarwal, M., Umrawal, A. K., Aggarwal, V., and Quinn, C. J. An explore-then-commit algorithm for submodular maximization under full-bandit feedback. In Uncertainty in Artificial Intelligence, pp. 1541–1551. PMLR, 2022.
Nie et al. (2023a) Nie, G., Nadew, Y. Y., Zhu, Y., Aggarwal, V., and Quinn, C. J. A framework for adapting offline algorithms to solve combinatorial multi-armed bandit problems with bandit feedback. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pp. 26166–26198. PMLR, 23–29 Jul 2023a.
Nie et al. (2023b) Nie, G., Zhu, Y., Nadew, Y. Y., Basu, S., Pavan, A., and Quinn, C. J. Size-constrained k-submodular maximization in near-linear time. In Evans, R. J. and Shpitser, I. (eds.), Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, volume 216 of Proceedings of Machine Learning Research, pp. 1545–1554. PMLR, 31 Jul–04 Aug 2023b.
Ohsaka & Yoshida (2015) Ohsaka, N. and Yoshida, Y. Monotone k-submodular function maximization with size constraints. In Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., and Garnett, R. (eds.), Advances in Neural Information Processing Systems, volume 28. Curran Associates, Inc., 2015.
Oshima (2021) Oshima, H. Improved randomized algorithm for k-submodular function maximization. SIAM Journal on Discrete Mathematics, 35(1):1–22, 2021.
Pham et al. (2021) Pham, C. V., Vu, Q. C., Ha, D. K., and Nguyen, T. T. Streaming algorithms for budgeted k-submodular maximization problem. In Computational Data and Social Networks: 10th International Conference, Proceedings, pp. 27–38. Springer, 2021.
Sakaue (2017) Sakaue, S. On maximizing a monotone k-submodular function subject to a matroid constraint. Discrete Optimization, 23:105–113, 2017. ISSN 1572-5286.
Soma (2019) Soma, T. No-regret algorithms for online $k$ -submodular maximization. In Chaudhuri, K. and Sugiyama, M. (eds.), Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, volume 89 of Proceedings of Machine Learning Research, pp. 1205–1214. PMLR, 16–18 Apr 2019.
Streeter & Golovin (2008) Streeter, M. and Golovin, D. An online algorithm for maximizing submodular functions. In Proceedings of the 21st International Conference on Neural Information Processing Systems, pp. 1577–1584, 2008.
Sun et al. (2022) Sun, Y., Liu, Y., and Li, M. Maximization of $k$ -submodular function with a matroid constraint. In Du, D.-Z., Du, D., Wu, C., and Xu, D. (eds.), Theory and Applications of Models of Computation, pp. 1–10, Cham, 2022. Springer International Publishing.
Sviridenko (2004) Sviridenko, M. A note on maximizing a submodular set function subject to a knapsack constraint. Operations Research Letters, 32:41–43, 2004.
Tajdini et al. (2023) Tajdini, A., Jain, L., and Jamieson, K. Minimax optimal submodular optimization with bandit feedback, 2023.
Takemori et al. (2020) Takemori, S., Sato, M., Sonoda, T., Singh, J., and Ohkuma, T. Submodular bandit problem under multiple constraints. In Conference on Uncertainty in Artificial Intelligence, pp. 191–200. PMLR, 2020.
Tang et al. (2022) Tang, Z., Wang, C., and Chan, H. On maximizing a monotone k-submodular function under a knapsack constraint. Operations Research Letters, 50:28–31, 2022.
Ward & Živný (2014) Ward, J. and Živný, S. Maximizing k-submodular functions and beyond. ACM Transactions on Algorithms, 12:1 – 26, 2014.
Xiao et al. (2023) Xiao, H., Liu, Q., Zhou, Y., and Li, M. Approximation algorithms for $k$ -submodular maximization subject to a knapsack constraint, 2023.
Yu et al. (2016) Yu, B., Fang, M., and Tao, D. Linear submodular bandits with a knapsack constraint. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30, 2016.
Yu et al. (2023) Yu, K., Li, M., Zhou, Y., and Liu, Q. On maximizing monotone or non-monotone k-submodular functions with the intersection of knapsack and matroid constraints. J. Comb. Optim., 45(3), apr 2023. ISSN 1382-6905.
Yue & Guestrin (2011) Yue, Y. and Guestrin, C. Linear submodular bandits and their application to diversified retrieval. Advances in Neural Information Processing Systems, 24, 2011.

Appendix A Missing Proofs

A.1 Proof of Lemma 3.2

We restate the lemma here:

Lemma A.1.

Let $c,d\in\mathbb{R}_{+}$ . Conditioning on $\boldsymbol{s}^{(j-1)}$ , suppose that

\displaystyle\sum_{i=1}^{k}\left(\hat{a}_{i^{*}}^{(j)}-\hat{a}_{i}^{(j)}\right)p_{i}^{(j)}\leq c\left(\sum_{i=1}^{k}\hat{y}_{i}^{(j)}p_{i}^{(j)}\right)+d\varepsilon

(12)

holds for each $j$ with $1\leq j\leq n$ , where $i^{*}=\boldsymbol{o}(e^{(j)})$ . Then $\mathbb{E}[f(\boldsymbol{s})]\geq\frac{1}{1+c}f(\boldsymbol{o})-(2c+d+4)n\varepsilon$ .

Proof.

Fix $j\in[n]$ . Conditioning on $\boldsymbol{s}^{(j-1)}$ , element $e^{(j)}$ will be randomly assigned a type $i\in[k]$ following probabilities described in Lines 9-11 in the pseudocode. We first calculate the conditional expected difference between $f(\boldsymbol{o}^{(j-1)})$ and $f(\boldsymbol{o}^{(j)})$ . Let $i^{*}$ denote the type of $e^{(j)}$ in $\boldsymbol{o}$ (and thus in $\boldsymbol{o}^{(j-1)}$ ). By construction, since $\boldsymbol{t}^{(j-1)}$ is obtained from $\boldsymbol{o}^{(j)}$ by changing $\boldsymbol{o}^{(j)}(e^{(j)})$ (which is the same as $\boldsymbol{s}(e^{(j)})$ and thus is $i$ with probability $p_{i}^{(j)}$ ) with 0, we have

\displaystyle\mathbb{E}[f(\boldsymbol{o}^{(j)})-f(\boldsymbol{t}^{(j-1)})|\boldsymbol{s}^{(j-1)}]=\sum_{i=1}^{k}\Delta_{e^{(j)},i}f(\boldsymbol{t}^{(j-1)})p_{i}^{(j)}=\sum_{i=1}^{k}a_{i}^{(j)}p_{i}^{(j)}.

(13)

Also, $\boldsymbol{o}^{(j-1)}$ can be obtained from $\boldsymbol{t}^{(j-1)}$ by replacing $\boldsymbol{t}^{(j-1)}(e^{(j)})$ (which is 0) with $\boldsymbol{o}^{(j)}(e^{(j)})$ (which is $i^{*}$ ), we have

\displaystyle f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{t}^{(j-1)})=a_{i^{*}}^{(j)}.

(14)

Thus

	$\displaystyle\mathbb{E}[f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})\|\boldsymbol{s}^{(j-1)}]$
$\displaystyle=$	$\displaystyle\mathbb{E}[f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{t}^{(j-1)})+f(\boldsymbol{t}^{(j-1)})-f(\boldsymbol{o}^{(j)})\|\boldsymbol{s}^{(j-1)}]$
$\displaystyle=$	$\displaystyle\sum_{i=1}^{k}\left(a_{i^{*}}^{(j)}-a_{i}^{(j)}\right)p_{i}^{(j)}$	(using (13) and (14))
$\displaystyle\leq$	$\displaystyle\sum_{i=1}^{k}\left(\hat{a}_{i^{*}}^{(j)}-\hat{a}_{i}^{(j)}+4\varepsilon\right)p_{i}^{(j)}$	(definition of $\hat{f}$ )
$\displaystyle=$	$\displaystyle\sum_{i=1}^{k}\left(\hat{a}_{i^{*}}^{(j)}-\hat{a}_{i}^{(j)}\right)p_{i}^{(j)}+4\varepsilon$	(15)

where the last equality is due to $\sum_{i}p_{i}^{(j)}=1$ . $\boldsymbol{s}^{(j)}$ can be obtained from $\boldsymbol{s}^{(j-1)}$ by replacing $\boldsymbol{s}^{(j-1)}(e^{(j)})$ (which is 0) with $\boldsymbol{s}^{(j)}(e^{(j)})$ (which is $i$ with probability $p_{i}^{(j)}$ ), we have

	$\displaystyle\mathbb{E}[f(\boldsymbol{s}^{(j)})-f(\boldsymbol{s}^{(j-1)})\|\boldsymbol{s}^{(j-1)}]$
$\displaystyle=$	$\displaystyle\sum_{i=1}^{k}y_{i}^{(j)}p_{i}^{(j)}$	(by construction)
$\displaystyle\geq$	$\displaystyle\sum_{i=1}^{k}(\hat{y}_{i}^{(j)}-2\varepsilon)p_{i}^{(j)}$	(definition of $\hat{f}$ )
$\displaystyle=$	$\displaystyle\sum_{i=1}^{k}\hat{y}_{i}^{(j)}p_{i}^{(j)}-2\varepsilon,$	(16)

where both expectations are taken over the randomness of the algorithm. Combining (15), (16) and (12), we have

\displaystyle\mathbb{E}[f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})|\boldsymbol{s}^{(j-1)}]\leq c\mathbb{E}[f(\boldsymbol{s}^{(j)})-f(\boldsymbol{s}^{(j-1)})|\boldsymbol{s}^{(j-1)}]+(2c+d+4)\varepsilon.

(17)

Note that $\boldsymbol{o}^{(0)}=\boldsymbol{o}$ and $\boldsymbol{o}^{(n)}=\boldsymbol{s}$ by construction. Hence

$\displaystyle f(\boldsymbol{o})-\mathbb{E}[f(\boldsymbol{s})]$	$\displaystyle=\sum_{j=1}^{n}\mathbb{E}[f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})]$	(by construction)
	$\displaystyle=\sum_{j=1}^{n}\mathbb{E}[\mathbb{E}[f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})\|\boldsymbol{s}^{(j-1)}]]$
	$\displaystyle=\sum_{j=1}^{n}\mathbb{E}\left[c\mathbb{E}[f(\boldsymbol{s}^{(j)})-f(\boldsymbol{s}^{(j-1)})\|\boldsymbol{s}^{(j-1)}]+(2c+d+4)n\varepsilon]\right]$	(using (17))
	$\displaystyle\leq c\left(\sum_{j=1}^{n}\mathbb{E}[f(\boldsymbol{s}^{(j)})-f(\boldsymbol{s}^{(j-1)})]\right)+(2c+d+4)n\varepsilon$
	$\displaystyle=c(\mathbb{E}[f(\boldsymbol{s})]-f(\mathbf{0}))+(2c+d+4)n\varepsilon$
	$\displaystyle=c\mathbb{E}[f(\boldsymbol{s})]+(2c+d+4)n\varepsilon,$	( $f(\boldsymbol{0})=0$ )

and we get the statement. ∎

Algorithm 2 A randomized algorithm for

k

-submodular maximization without constraints (Iwata et al., 2015)

1: Input: ground set

V

, value oracle access for

f

2: Output: a vector

\boldsymbol{s}\in(k+1)^{V}

3: Initialize

\boldsymbol{s}\leftarrow\mathbf{0}_{n}

4: for

e\in V

y_{i}\leftarrow\Delta_{e,i}f(\boldsymbol{s})

for

i\in[k]

6: w.l.o.g. assume

y_{1}\geq y_{2}\geq\ldots\geq y_{k}

i^{+}\leftarrow\begin{cases}\text{max. }i\text{ such that }y_{i}>0&\text{if }y_{1}>0,\\ 0&\text{otherwise.}\end{cases}

8: Initialize

p\leftarrow\mathbf{0}_{k}

9: if

i^{+}\leq 1

p_{1}\leftarrow 1

10: else if

i^{+}=2

, for

i\in\{1,2\}

p_{i}\leftarrow\frac{y_{i}}{(y_{1}+y_{2})}

11: else

p_{i^{+}}\leftarrow\left(\frac{1}{2}\right)^{i^{+}-1}

and for

i<i^{+}

p_{i}\leftarrow\left(\frac{1}{2}\right)^{i}

12: Choose

\boldsymbol{s}(e)\sim\mathbb{P}(\boldsymbol{s}(e)=i)=p_{i}

for all

i\in[k]

13: end for

14: Return

\boldsymbol{s}

A.2 Proof of Proposition 3.3

We repeat some notations that we will use in our analysis that have already been introduced in the main text.

Let $\boldsymbol{o}$ be an optimal solution with $\operatorname{supp}(\boldsymbol{o})=V$ (by Proposition 3.1 such a solution exists). Let $\boldsymbol{s}$ be the output of the algorithm with $|\operatorname{supp}(\boldsymbol{s})|=n$ using surrogate function $\hat{f}$ . We consider the $j$ -th iteration of the algorithm, and let $e^{(j)}$ be the element of $V$ considered in the $j$ -th iteration, $p_{i}^{(j)}$ be the probability that $i$ -th type is chosen in the $j$ -th iteration, and $\boldsymbol{s}^{(j)}$ be the solution after the $i$ -th iteration, where $\boldsymbol{s}^{(0)}=\mathbf{0}$ . Also for $0\leq j\leq n$ , let $\boldsymbol{o}^{(j)}$ be the element in $\{0,1,\ldots,k\}^{V}$ obtained from $\boldsymbol{o}$ by replacing the coordinates on $\operatorname{supp}(\boldsymbol{s}^{(j)})$ with those of $\boldsymbol{s}^{(j)}$ , and for $1\leq j\leq n$ let $\boldsymbol{t}^{(j-1)}$ be the element in $\{0,1,\ldots,k\}^{V}$ obtained from $\boldsymbol{o}^{(j)}$ by changing $\boldsymbol{o}^{(j)}(e^{(j)})$ with 0.

The subsequent analyses will involve comparing different marginal gains, which we next introduce. For $i\in[k]$ , let $y_{i}^{(j)}=\Delta_{e^{(j)},i}f(\boldsymbol{s}^{(j-1)})$ , $\hat{y}_{i}^{(j)}=\Delta_{e^{(j)},i}\hat{f}(\boldsymbol{s}^{(j-1)})$ and let $a_{i}^{(j)}=\Delta_{e^{(j)},i}f(\boldsymbol{t}^{(j-1)})$ , $\hat{a}_{i}^{(j)}=\Delta_{e^{(j)},i}\hat{f}(\boldsymbol{t}^{(j-1)})$ . Due to pairwise monotonicity, we have $y_{i}^{(j)}+y_{i^{\prime}}^{(j)}\geq 0$ and $a_{i}^{(j)}+a_{i^{\prime}}^{(j)}\geq 0$ for all $i,i^{\prime}\in[k]$ with $i\neq i^{\prime}$ , and thus, while the surrogate function $\hat{f}$ does not necessarily have pairwise monotonicity, we have

	$\displaystyle\hat{y}_{i}^{(j)}+\hat{y}_{i^{\prime}}^{(j)}\geq-4\varepsilon$	,		(18)
	$\displaystyle\hat{a}_{i}^{(j)}+\hat{a}_{i^{\prime}}^{(j)}\geq-4\varepsilon$	.		(19)

\displaystyle\hat{y}_{i}^{(j)}\geq\hat{a}_{i}^{(j)}-4\varepsilon.

(20)

for all $i\in[k]$ . Without loss of generality, we assume $f(\boldsymbol{0})=0$ .

Now we are ready to prove Proposition 3.3.

Proof of Proposition 3.3.

It is trivial to see the number of value oracle queries required is $nk$ , as it iterates through all item-type pairs. The following proof will be dedicated to show the $\alpha$ and $\delta$ terms in Definition 2.2.

By Definition 2.2, we aim to show that the output $\boldsymbol{s}$ of Algorithm 2 satisfies $\mathbb{E}[f(\boldsymbol{s})]\geq\frac{1}{2}f(\boldsymbol{o})-20n\varepsilon$ when the function evaluation is based on surrogate function $\hat{f}$ . By Lemma 3.2, our goal is to show

\displaystyle\sum_{i=1}^{k}(\hat{y}_{i}+\hat{a}_{i})p_{i}\geq\hat{a}_{i^{*}}-14\varepsilon,

(21)

which is equivalent to (4) in main text with $c=1$ and $d=14$ . For simplicity of the description we shall omit the superscript $(j)$ if it is clear from the context. Recall that $\hat{y}_{i}+\hat{y}_{i^{\prime}}\geq-4\varepsilon$ and $\hat{a}_{i}+\hat{a}_{i^{\prime}}\geq-4\varepsilon$ for $i,i^{\prime}\in[k]$ with $i\neq i^{\prime}$ , and $\hat{y}_{i}\geq\hat{a}_{i}-4\varepsilon$ for $i\in[k]$ (c.f. (18),(19) and (20)). We break up the analysis to three cases: $i^{+}\leq 1$ , $i^{+}=2$ and $i^{+}\geq 3$ .

Case 1:

If $i^{+}\leq 1$ , then by Line 9 in Algorithm 2 $p_{1}=1$ and $p_{i}=0$ for $i>1$ . So (21) specializes to $\hat{a}_{1}+\hat{y}_{1}\geq\hat{a}_{i^{*}}-14\varepsilon$ . Since $\hat{y}_{i}+\hat{y}_{i^{\prime}}\geq-4\varepsilon$ for $i,i^{\prime}\in[k]$ and $\hat{y}^{\prime}s$ are ranked in decreasing order, we have $\hat{y}_{1}+\hat{y}_{i^{\prime}}\geq-4\varepsilon$ and $\hat{y}_{1}\geq\hat{y}_{i^{\prime}}$ for $i^{\prime}\in[k]$ . Summing these two inequalities we have $\hat{y}_{1}\geq-2\varepsilon$ .

Sub-case 1.a:

If $i^{*}=1$ ,

\hat{a}_{1}+\hat{y}_{1}=\hat{a}_{i^{*}}+\hat{y}_{1}\geq\hat{a}_{i^{*}}-2\varepsilon\geq\hat{a}_{i^{*}}-14\varepsilon,

showing the statement.

Sub-case 1.b:

If $i^{*}\neq 1$ , from $i^{+}\leq 1$ and (20), $0\geq\hat{y}_{i^{*}}\geq\hat{a}_{i^{*}}-4\varepsilon$ . We also have $\hat{a}_{1}+\hat{a}_{i^{*}}\geq-4\varepsilon$ by (19). Hence by summing these two inequalities we have $\hat{a}_{1}\geq-8\varepsilon$ . Combining with $\hat{y}_{1}\geq-2\varepsilon$ we have

\hat{a}_{1}+\hat{y}_{1}\geq-10\varepsilon\geq\hat{a}_{i^{*}}-14\varepsilon,

showing the statement.

Case 2:

If $i^{+}=2$ , , then by Line 10 in Algorithm 2, (21) specializes to

(\hat{a}_{1}+\hat{y}_{1})\hat{y}_{1}+(\hat{a}_{2}+\hat{y}_{2})\hat{y}_{2}\geq\hat{a}_{i^{*}}(\hat{y}_{1}+\hat{y}_{2})-14\varepsilon(\hat{y}_{1}+\hat{y}_{2}).

In this case, we have $\hat{y}_{1},\hat{y}_{2}>0$ by definition of $i^{+}$ .

Sub-case 2.a:

$i^{*}\leq 2$ . Now $(\hat{a}_{1}+\hat{y}_{1})\hat{y}_{1}+(\hat{a}_{2}+\hat{y}_{2})\hat{y}_{2}=\hat{a}_{1}\hat{y}_{1}+\hat{a}_{2}\hat{y}_{2}+(\hat{y}_{1}-\hat{y}_{2})^{2}+2\hat{y}_{1}\hat{y}_{2}\geq\hat{a}_{1}\hat{y}_{1}+\hat{a}_{2}\hat{y}_{2}+2\hat{y}_{1}\hat{y}_{2}$ . If $i^{*}=1$ , then

	$\displaystyle\hat{a}_{1}\hat{y}_{1}+\hat{a}_{2}\hat{y}_{2}+2\hat{y}_{1}\hat{y}_{2}$
	$\displaystyle\geq\hat{a}_{1}\hat{y}_{1}+\hat{a}_{2}\hat{y}_{2}+2(\hat{a}_{1}-4\varepsilon)\hat{y}_{2}$		(using (20))
	$\displaystyle=\hat{a}_{1}(\hat{y}_{1}+\hat{y}_{2})+(\hat{a}_{2}+\hat{a}_{1})\hat{y}_{2}-8\varepsilon\hat{y}_{2}$
	$\displaystyle\geq\hat{a}_{1}(\hat{y}_{1}+\hat{y}_{2})-12\varepsilon\hat{y}_{2}$		(using (19))
	$\displaystyle\geq\hat{a}_{1}(\hat{y}_{1}+\hat{y}_{2})-12\varepsilon(\hat{y}_{1}+\hat{y}_{2})$		( $\hat{y}_{1}\geq 0$ )
	$\displaystyle=\hat{a}_{i^{*}}(\hat{y}_{1}+\hat{y}_{2})-12\varepsilon(\hat{y}_{1}+\hat{y}_{2})$		( $i^{*}=1$ )
	$\displaystyle\geq\hat{a}_{i^{*}}(\hat{y}_{1}+\hat{y}_{2})-14\varepsilon(\hat{y}_{1}+\hat{y}_{2})$		(22)

as required. By a similar calculation the claim follows if $i^{*}=2$ .

Sub-case 2.b:

If $i^{*}\geq 3$ , then $0\geq\hat{y}_{i^{*}}\geq\hat{a}_{i^{*}}-4\varepsilon$ by definition of $i^{+}$ and (20). By (19), we also have that $\hat{a}_{1}+\hat{a}_{i^{*}}\geq-4\varepsilon$ and $\hat{a}_{2}+\hat{a}_{i^{*}}\geq-4\varepsilon$ . Thus,

	$\displaystyle(\hat{a}_{1}+\hat{y}_{1})\hat{y}_{1}+(\hat{a}_{2}+\hat{y}_{2})\hat{y}_{2}$
	$\displaystyle=\hat{a}_{1}\hat{y}_{1}+\hat{y}_{1}^{2}+\hat{a}_{2}\hat{y}_{2}+\hat{y}_{2}^{2}$
	$\displaystyle\geq\hat{a}_{1}\hat{y}_{1}+\hat{a}_{2}\hat{y}_{2}$		( $\hat{y}_{1},\hat{y}_{2}>0$ in Case 2)
	$\displaystyle\geq(-\hat{a}_{i^{*}}-4\varepsilon)(\hat{y}_{1}+\hat{y}_{2})$
	$\displaystyle\geq(-\hat{a}_{i^{}}-4\varepsilon)(\hat{y}_{1}+\hat{y}_{2})+2(\hat{a}_{i^{}}-4\varepsilon)(\hat{y}_{1}+\hat{y}_{2})$		( $\hat{y}_{1},\hat{y}_{2}\geq 0$ and $\hat{a}_{i^{*}}-4\varepsilon\leq 0$ )
	$\displaystyle=\hat{a}_{i^{*}}(\hat{y}_{1}+\hat{y}_{2})-12\varepsilon(\hat{y}_{1}+\hat{y}_{2})$
	$\displaystyle\geq\hat{a}_{i^{*}}(\hat{y}_{1}+\hat{y}_{2})-14\varepsilon(\hat{y}_{1}+\hat{y}_{2})$		(23)

as required.

Case 3:

When $i^{+}\geq 3$ , our goal is to show

\displaystyle\sum_{i=1}^{k}(\hat{y}_{i}+\hat{a}_{i})p_{i}\geq\hat{a}_{i^{*}}-14\varepsilon,

(24)

which is equivalent to (4) with $c=1$ and $d=14$ . By the ordering of $\{\hat{y}_{i}\}_{i\in[k]}$ and (3), for $i\leq i^{*}$ we have

\displaystyle\hat{y}_{i}\geq\hat{y}_{i^{*}}\geq\hat{a}_{i^{*}}-4\varepsilon.

(25)

Denote $r\in\operatorname*{arg\,min}_{i\in[k]}\hat{a}_{i}$ .

Sub-case 3.a: If $r=i^{*}$ , we have $\sum_{i}\hat{a}_{i}p_{i}\geq\hat{a}_{i^{*}}(\sum_{i}p_{i})=\hat{a}_{i^{*}}$ . Since $\sum_{i}\hat{y}_{i}p_{i}\geq 0$ and since from Line 11 in Algorithm 2, a positive probability $p_{i}>0$ is only assigned to positive types $i$ with positive $\hat{y}_{i}>0$ values,

\displaystyle\sum_{i=1}^{k}(\hat{y}_{i}+\hat{a}_{i})p_{i}\geq 0+\hat{a}_{i^{*}}\geq\hat{a}_{i^{*}}-14\varepsilon.

Thus (24) follows.

Sub-case 3.b: If $r\neq i^{*}$ and $i^{*}\geq i^{+}$ , since from Line 11 in Algorithm 2, a positive probability $p_{i}>0$ is only assigned to positive $\hat{y}$ values, we have that $\sum_{i}\hat{y}_{i}p_{i}=\sum_{i\leq i^{+}}\hat{y}_{i}p_{i}\geq\sum_{i\leq i^{+}}(\hat{a}_{i^{*}}-4\varepsilon)p_{i}=\hat{a}_{i^{*}}-4\varepsilon$ by (25). When $\hat{a}_{r}\geq 0$ , since $\hat{a}_{i}\geq\hat{a}_{r}$ for any $i\neq r$ by definition of $r$ , we have $\sum_{i}\hat{a}_{i}p_{i}\geq 0$ . When $\hat{a}_{r}<0$ ,

$\displaystyle\sum_{i}\hat{a}_{i}p_{i}$	$\displaystyle\geq\sum_{i\neq r}(-\hat{a}_{r}-4\varepsilon)p_{i}+\hat{a}_{r}p_{r}$	(by (2))
	$\displaystyle=\hat{a}_{r}\left(p_{r}-\sum_{i\neq r}p_{i}\right)-4\varepsilon\sum_{i\neq r}p_{i}$
	$\displaystyle\geq 0-4\varepsilon,$

\displaystyle\sum_{i=1}^{k}(\hat{y}_{i}+\hat{a}_{i})p_{i}\geq(\hat{a}_{i^{*}}-4\varepsilon)+(-4\varepsilon)\geq\hat{a}_{i^{*}}-14\varepsilon.

Therefore (24) holds.

Sub-case 3.c: If $r\neq i^{*}$ and $i^{*}<i^{+}$ , we have

$\displaystyle\sum_{i=1}^{k}(\hat{y}_{i}+\hat{a}_{i})p_{i}$	$\displaystyle=\sum_{i=1}^{k}\hat{y}_{i}p_{i}+\sum_{i=1}^{k}\hat{a}_{i}p_{i}$
	$\displaystyle\geq\sum_{i\leq i^{}}(\hat{a}_{i^{}}-4\varepsilon)p_{i}+\sum_{i>i^{*}}(\hat{a}_{i}-4\varepsilon)p_{i}+\sum_{i=1}^{k}\hat{a}_{i}p_{i}$	(using (25) and (3))
	$\displaystyle=\sum_{i\leq i^{}}\hat{a}_{i^{}}p_{i}+\sum_{i>i^{*}}\hat{a}_{i}p_{i}+\sum_{i=1}^{k}\hat{a}_{i}p_{i}-4\varepsilon$
	$\displaystyle=\left(\sum_{i<i^{}}\hat{a}_{i^{}}p_{i}+2\hat{a}_{i^{}}p_{i^{}}\right)+\left(\sum_{i>i^{}}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{}}\hat{a}_{i}p_{i}+\hat{a}_{r}p_{r}\right)-4\varepsilon.$	(26)

The first term equals $\hat{a}_{i^{*}}$ by construction of $p_{i}$ ’s and since $i^{*}<i^{+}$ . Hence it suffices to show that the second term of (26) is greater than or equal to $-10\varepsilon$ .

Subsub-case 3.c.I: First we consider $\hat{a}_{r}\geq 0$ . By definition of $r$ , for all $i\in[k]$ , $\hat{a}_{i}\geq\hat{a}_{r}\geq 0$ . The second term of (26) can then be bounded as

\displaystyle\sum_{i>i^{*}}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{*}}\hat{a}_{i}p_{i}+\hat{a}_{r}p_{r}\geq\hat{a}_{r}\left(p_{r}+\sum_{i>i^{*}}p_{i}+\sum_{i\neq r,i^{*}}p_{i}\right)\geq 0\geq-10\varepsilon.

(27)

Subsub-case 3.c.II: Second we consider $\hat{a}_{r}<0$ . Since $i^{*}<i^{+}$ , we have

$\displaystyle\sum_{i>i^{}}p_{i}+\sum_{i\neq r,i^{}}p_{i}$	$\displaystyle=\sum_{i^{}+1\leq i\leq i^{+}-1}p_{i}+p_{i^{+}}+\sum_{i\neq r,i^{}}p_{i}$
	$\displaystyle=\sum_{i=i^{}+1}^{i^{+}-1}\left(\frac{1}{2}\right)^{i}+\left(\frac{1}{2}\right)^{i^{+}-1}+\sum_{i\neq r,i^{}}p_{i}$	(by construction of $p_{i}$ ’s)
	$\displaystyle=\left(\frac{1}{2}\right)^{i^{}}+\sum_{i\neq r,i^{}}p_{i}$	(geometric sum)
	$\displaystyle=p_{i^{}}+\sum_{i\neq r,i^{}}p_{i}=1-p_{r}.$	(28)

Therefore, if $r<i^{*}$ , we get

	$\displaystyle\sum_{i>i^{}}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{}}\hat{a}_{i}p_{i}+\hat{a}_{r}p_{r}$
	$\displaystyle\geq\sum_{i>i^{}}(-\hat{a}_{r}-4\varepsilon)p_{i}+\sum_{i\neq r,i^{}}(-\hat{a}_{r}-4\varepsilon)p_{i}+\hat{a}_{r}p_{r}$		(using (2) )
	$\displaystyle=\hat{a}_{r}\left(p_{r}-\sum_{i>i^{}}p_{i}-\sum_{i\neq r,i^{}}p_{i}\right)-4\varepsilon\left(\sum_{i>i^{}}p_{i}+\sum_{i\neq r,i^{}}p_{i}\right)$
	$\displaystyle=\hat{a}_{r}\left(p_{r}-\left(1-p_{r}\right)\right)-4\varepsilon(1-p_{r})$		(using (28))
	$\displaystyle=\hat{a}_{r}\left(2p_{r}-1\right)-4\varepsilon(1-p_{r})$
	$\displaystyle\geq 0-4\varepsilon\geq-10\varepsilon.$		( $\hat{a}_{r}<0$ and $p_{r}\leq 1/2$ )

If $r>i^{*}$ . Then $p_{r}\leq 1/4$ by Line 11 in Algorithm 2 since $r\neq 1$ and $i^{+}\geq 3$ . Hence, by $\hat{a}_{r}<0$ , the second term of (26) can then be bounded as

	$\displaystyle\sum_{i>i^{}}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{}}\hat{a}_{i}p_{i}+\hat{a}_{r}p_{r}$
	$\displaystyle=\sum_{i>i^{},i\neq r}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{}}\hat{a}_{i}p_{i}+2\hat{a}_{r}p_{r}$
	$\displaystyle\geq\sum_{i>i^{},i\neq r}(-\hat{a}_{r}-4\varepsilon)p_{i}+\sum_{i\neq r,i^{}}(-\hat{a}_{r}-4\varepsilon)p_{i}+2\hat{a}_{r}p_{r}$		(using (2))
	$\displaystyle=\hat{a}_{r}\left(2p_{r}-\sum_{i>i^{},i\neq r}p_{i}-\sum_{i\neq r,i^{}}p_{i}\right)-4\varepsilon\left(\sum_{i>i^{},i\neq r}p_{i}+\sum_{i\neq r,i^{}}p_{i}\right)$
	$\displaystyle=\hat{a}_{r}(2p_{r}-(1-2p_{r}))-4\varepsilon(1-2p_{r})$		(using (28))
	$\displaystyle\geq 0-4\varepsilon\geq-10\varepsilon.$		( $\hat{a}_{r}<0$ and $p_{r}\leq 1/4$ )

Thus we conclude that when $\hat{a}_{r}<0$ , we have

\displaystyle\sum_{i>i^{*}}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{*}}\hat{a}_{i}p_{i}+\hat{a}_{r}p_{r}\geq-10\varepsilon.

(29)

Combining (29) and (27), we conclude Case 3 of the proof. ∎

Algorithm 3 A randomized algorithm for monotone

k

-submodular maximization without constraints (adapted from (Iwata et al., 2015))

1: Input: ground set

V

, value oracle access for a monotone

k

-submodular function

f:(k+1)^{V}\rightarrow\mathbb{R}_{+}

2: Output: a vector

\boldsymbol{s}\in(k+1)^{V}

3: Initialize

\boldsymbol{s}\leftarrow\mathbf{0}_{n}

t\leftarrow k-1

4: for

e\in V

y_{i}\leftarrow\Delta_{e,i}f(\boldsymbol{s})

for

i\in[k]

\beta\leftarrow\sum_{i=1}^{k}[y_{i}]_{+}^{t}

7: Initialize

p\leftarrow\mathbf{0}_{k}

8: if

\beta\neq 0

then

p_{i}\leftarrow\frac{y_{i}^{t}}{\beta}

for

i\in[k]

10: else

11:

p_{i}\leftarrow\begin{cases}1&\text{if }i=1\\ 0&\text{otherwise.}\end{cases}

for

i\in[k]

12: end if

13: Choose

\boldsymbol{s}(e)\sim\mathbb{P}(\boldsymbol{s}(e)=i)=p_{i}

for all

i\in[k]

14: end for

15: Return

\boldsymbol{s}

A.3 Proof of Proposition 4.1

We use the same setup and constructions as in Section A.2 and further, since we are considering the monotone case, we have $a_{i}^{(j)}\geq 0$ and $y_{i}^{(j)}\geq 0$ for all $i\in[k]$ and $j\in[n]$ . Thus, for the surrogate function $\hat{f}$ , we have

\displaystyle\hat{a}_{i}^{(j)}\geq-2\varepsilon\text{ and }\hat{y}_{i}^{(j)}\geq-2\varepsilon.

(30)

Now start to prove Proposition 4.1. Again, we shall omit the superscript $(j)$ if it is clear from the context.

Proof of Proposition 4.1.

It is trivial to see the number of value oracle queries required is $nk$ , as it iterates through all item-type pairs. The following proof will be dedicated to showing the $\alpha$ and $\delta$ terms in Definition 2.2.

\displaystyle\sum_{i=1}^{k}(\hat{a}_{i^{*}}-\hat{a}_{i})p_{i}\leq(1-\frac{1}{k})\sum_{i=1}^{k}\hat{y}_{i}p_{i}+10\varepsilon,

(31)

which is equivalent to (4) in main text with $c=1-\frac{1}{k}$ and $d=10$ . Recall that $\hat{y}_{i}+\hat{y}_{i^{\prime}}\geq-4\varepsilon$ and $\hat{a}_{i}+\hat{a}_{i^{\prime}}\geq-4\varepsilon$ for $i,i^{\prime}\in[k]$ with $i\neq i^{\prime}$ , and $\hat{y}_{i}\geq\hat{a}_{i}-4\varepsilon$ for $i\in[k]$ (c.f. (18),(19) and (20)). We break up the analysis into two cases: $\beta=0$ and $\beta>0$ .

Case 1:

When $\beta=0$ , we have $\hat{y}_{i}\leq 0$ for all $i\in[k]$ due to the definition of $\beta$ . In this case, (31) specializes to

\displaystyle\hat{a}_{i^{*}}-\hat{a}_{1}\leq(1-\frac{1}{k})\hat{y}_{1}+10\varepsilon.

(32)

We have

$\displaystyle\hat{a}_{i^{*}}-\hat{a}_{1}$	$\displaystyle\leq\hat{a}_{i^{*}}+2\varepsilon$	(using $\hat{a}_{1}\geq-2\varepsilon$ )
	$\displaystyle\leq 6\varepsilon$	(using $\hat{a}_{i^{}}-4\varepsilon\leq\hat{y}_{i^{}}\leq 0$ )
	$\displaystyle\leq-\hat{y}_{i^{*}}+6\varepsilon$	(using $\hat{y}_{i^{*}}\leq 0$ )
	$\displaystyle\leq\hat{y}_{1}+10\varepsilon$	(using $\hat{y}_{i^{*}}+\hat{y}_{1}\geq-4\varepsilon$ )
	$\displaystyle\leq(1-\frac{1}{k})\hat{y}_{1}+10\varepsilon,$	(using $\hat{y}_{1}\leq 0$ )

showing the statement.

Case 2:

When $\beta>0$ , we have $\hat{y}_{i}>0$ for some $i\in[k]$ . In this case, (31) specializes to

\displaystyle\sum_{i=1}^{k}(\hat{a}_{i^{*}}-\hat{a}_{i})\hat{y}_{i}^{t}\leq(1-\frac{1}{k})\sum_{i=1}^{k}\hat{y}_{i}^{t+1}+10\beta\varepsilon.

(33)

Sub-case 2.a:

If $k=1$ , the LHS of (33) is 0 since $i^{*}=1$ . RHS is positive since both $\hat{y}_{i}$ and $\beta$ are positive. Thus, (33) clearly holds.

Sub-case 2.b:

If $k\geq 2$ . Let $\gamma=(k-1)^{\frac{1}{t}}=t^{\frac{1}{t}}$ . We have

$\displaystyle\sum_{i\neq i^{}}(\hat{a}_{i^{}}-\hat{a}_{i})\hat{y}_{i}^{t}$	$\displaystyle\leq\sum_{i\neq i^{}}(\hat{a}_{i^{}}+2\varepsilon)\hat{y}_{i}^{t}$	(using $\hat{a}_{i}\geq-2\varepsilon$ )
	$\displaystyle\leq\sum_{i\neq i^{}}(\hat{y}_{i^{}}+6\varepsilon)\hat{y}_{i}^{t}$	(using $\hat{a}_{i}-4\varepsilon\leq\hat{y}_{i}$ )
	$\displaystyle=\sum_{i\neq i^{}}\hat{y}_{i^{}}\hat{y}_{i}^{t}+6\varepsilon\sum_{i\neq i^{*}}\hat{y}_{i}^{t}$
	$\displaystyle\leq\sum_{i\neq i^{}}\hat{y}_{i^{}}\hat{y}_{i}^{t}+6\varepsilon\sum_{i\neq i^{*}}[\hat{y}_{i}]_{+}^{t}$
	$\displaystyle=\sum_{i\neq i^{}}\hat{y}_{i^{}}\hat{y}_{i}^{t}+6\varepsilon(\beta-[\hat{y}_{i^{*}}]_{+}^{t})$	(definition of $\beta$ )
	$\displaystyle\leq\sum_{i\neq i^{}}\hat{y}_{i^{}}\hat{y}_{i}^{t}+6\varepsilon\beta$
	$\displaystyle=\frac{1}{\gamma}\left(\gamma\hat{y}_{i^{}}\sum_{i\neq i^{}}\hat{y}_{i}^{t}\right)+6\varepsilon\beta$	(34)

Using the weighted AM-GM inequality, $a^{\frac{1}{t+1}}b^{\frac{t}{t+1}}\leq\frac{1}{t+1}a+\frac{t}{t+1}b$ for all $a,b\geq 0$ , and letting $a=(\gamma\hat{y}_{i^{*}})^{t+1}$ and $b=(\sum_{i\neq i^{*}}\hat{y}_{i}^{t})^{\frac{t+1}{t}}$ , we deduce

\displaystyle\frac{1}{\gamma}\left(\gamma\hat{y}_{i^{*}}\sum_{i\neq i^{*}}\hat{y}_{i}^{t}\right)

\displaystyle\leq\frac{1}{\gamma}\left(\frac{1}{t+1}\left(\gamma\hat{y}_{i^{*}}\right)^{t+1}+\frac{t}{t+1}(\sum_{i\neq i^{*}}\hat{y}_{i}^{t})^{\frac{t+1}{t}}\right)

(35)

From Holder’s inequality, $\sum_{i}a_{i}\leq(\sum_{i}a_{i}^{\frac{t+1}{t}})^{\frac{t}{t+1}}(\sum_{i}1^{t+1})^{\frac{1}{t+1}}$ holds for any non-negative $a_{i}$ ’s. By setting $a_{i}=\hat{y}_{i}^{t}$ , we have

$\displaystyle\frac{1}{\gamma}\left(\gamma\hat{y}_{i^{}}\sum_{i\neq i^{}}\hat{y}_{i}^{t}\right)$	$\displaystyle\leq\frac{1}{\gamma}\left(\frac{1}{t+1}\left(\gamma\hat{y}_{i^{}}\right)^{t+1}+\frac{t}{t+1}\left(\sum_{i\neq i^{}}\hat{y}_{i}^{t+1}\right)\cdot\left(\sum_{i\neq i^{*}}1^{t+1}\right)^{\frac{1}{t}}\right)$
	$\displaystyle=\frac{1}{\gamma}\left(\frac{1}{t+1}\left(\gamma\hat{y}_{i^{}}\right)^{t+1}+\frac{t(k-1)^{1/t}}{t+1}\sum_{i\neq i^{}}\hat{y}_{i}^{t+1}\right)$
	$\displaystyle=\frac{\gamma^{t}}{t+1}\sum_{i=1}^{k}\hat{y}_{i}^{t+1}$
	$\displaystyle=(1-\frac{1}{k})\sum_{i=1}^{k}\hat{y}_{i}^{t+1}.$	(36)

Plug (36) back into (34), we get

	$\displaystyle\sum_{i\neq i^{}}(\hat{a}_{i^{}}-\hat{a}_{i})\hat{y}_{i}^{t}$	$\displaystyle\leq(1-\frac{1}{k})\sum_{i=1}^{k}\hat{y}_{i}^{t+1}+6\varepsilon\beta$
		$\displaystyle\leq(1-\frac{1}{k})\sum_{i=1}^{k}\hat{y}_{i}^{t+1}+10\varepsilon\beta,$		(37)

which is desired.

∎

Algorithm 4 Monotone

k

-submodular maximization with a IS constraint (Ohsaka & Yoshida, 2015)

1: Input: value oracle access for a monotone

k

-submodular function

f:(k+1)^{V}\rightarrow\mathbb{R}_{+}

, integers

B_{1},\ldots,B_{k}\in\mathbb{Z}_{+}

2: Output: a vector

\boldsymbol{s}

satisfying

\operatorname{supp}_{i}(\boldsymbol{s})=B_{i}

for each

i\in[k]

3: Initialize

\boldsymbol{s}\leftarrow\mathbf{0}

and

B\leftarrow\sum_{i\in[k]}B_{i}

4: for

j\in[B]

I\leftarrow\{i\in[k]|\operatorname{supp}_{i}(\boldsymbol{s})<B_{i}\}

(e,i)\leftarrow\operatorname*{arg\,max}_{e\in V\setminus\operatorname{supp}(\boldsymbol{s}),i\in I}\Delta_{e,i}f(\boldsymbol{s})

\boldsymbol{s}(e)\leftarrow i

8: end for

9: Return

\boldsymbol{s}

A.4 Proof of Proposition 5.1

Proof.

Let $\left(e^{(j)},i^{(j)}\right)\in V\times[k]$ be the pair greedily chosen in this iteration using the surrogate function $\hat{f}$ , and let $\boldsymbol{s}^{(j)}$ be the solution after this iteration. Let $\boldsymbol{o}$ be an optimal solution. For each $j\in[B]$ , we define $S_{i}^{(j)}=\operatorname{supp}_{i}(\boldsymbol{o}^{(j-1)})\backslash\operatorname{supp}_{i}(\boldsymbol{s}^{(j-1)})$ . Following the construction of Ohsaka & Yoshida (2015), we iteratively define $\boldsymbol{o}^{(0)}=\boldsymbol{o},\boldsymbol{o}^{(1)},\ldots,\boldsymbol{o}^{(B)}$ as follows. We have two cases to consider:

Case 1: Suppose that $e^{(j)}\in S_{i^{\prime}}^{(j)}$ for some $i^{\prime}\neq i^{(j)}$ . In this case, let $o^{(j)}$ be an arbitrary element in $S_{i^{(j)}}^{(j)}$ . Then, we define $\boldsymbol{o}^{(j-1/2)}$ as the resulting vector obtained from $\boldsymbol{o}^{(j-1)}$ by assigning 0 to the $e^{(j)}$ -th element and the $o^{(j)}$ -th element, and then define $\boldsymbol{o}^{(j)}$ as the resulting vector obtained from $\boldsymbol{o}^{(j-1/2)}$ by assigning $i^{(j)}$ to the $e^{(j)}$ -th element and $i^{\prime}$ to the $o^{(j)}$ -th element.

Case 2: Suppose that $e^{(j)}\notin S_{i^{\prime}}^{(j)}$ for any $i^{\prime}\neq i^{(j)}$ . In this case, we set $o^{(j)}=e^{(j)}$ if $e^{(j)}\in S_{i^{(j)}}^{(j)}$ , and we set $o^{(j)}$ to be an arbitrary element in $S_{i^{(j)}}^{(j)}$ otherwise. Then, we define $\boldsymbol{o}^{(j-1/2)}$ as the resulting vector obtained from $\boldsymbol{o}^{(j-1)}$ by assigning 0 to the $o^{(j)}$ -th element, and then define $\boldsymbol{o}^{(j)}$ as the resulting vector obtained from $\boldsymbol{o}^{(j-1/2)}$ by assigning $i^{(j)}$ to the $e^{(j)}$ -th element.

Intuitively, we want a transition from $\boldsymbol{o}$ to $\boldsymbol{s}$ by swapping in item-type pairs in $\boldsymbol{s}$ one by one to $\boldsymbol{o}$ , in the order of the pair is chosen by the greedy algorithm. In case 1, if the considered item $e^{(j)}$ is already in $\boldsymbol{o}^{(j-1)}$ but with another type $i^{\prime}\neq i^{(j)}$ , we obtain $\boldsymbol{o}^{(j-1/2)}$ from $\boldsymbol{o}^{(j-1)}$ by assign type 0 to item $e^{(j)}$ , and obtain $\boldsymbol{o}^{(j)}$ from $\boldsymbol{o}^{(j-1/2)}$ by assign type $i^{(j)}$ to item $e^{(j)}$ . In case 2, if there is no type $i^{\prime}\neq i^{(j)}$ that is assigned to item $e^{(j)}$ in $\boldsymbol{o}^{(j-1)}$ , there are two sub-cases. In the first sub-case, if $(e^{(j)},i^{(j)})$ is already in $\boldsymbol{o}^{(j-1)}$ , we set $\boldsymbol{o}^{(j)}=\boldsymbol{o}^{(j-1)}$ and obtain $\boldsymbol{o}^{(j-1/2)}$ from $\boldsymbol{o}^{(j-1)}$ by assign type 0 to item $e^{(j)}$ . In the second sub-case, if $e^{(j)}$ is not assigned any type in $\boldsymbol{o}^{(j-1)}$ , we obtain $\boldsymbol{o}^{(j-1/2)}$ from $\boldsymbol{o}^{(j-1)}$ by assign type 0 to an arbitrary item with type $i^{(j)}$ in $\boldsymbol{o}^{(j-1)}$ , and obtain $\boldsymbol{o}^{(j)}$ from $\boldsymbol{o}^{(j-1/2)}$ by assign type $i^{(j)}$ to item $e^{(j)}$ . Note that $\left|\operatorname{supp}_{i}(\boldsymbol{o}^{(j)})\right|=B_{i}$ holds for every $i\in[k]$ and $j\in\{0,1,\ldots,B\}$ , and $\boldsymbol{o}^{(B)}=\boldsymbol{s}^{(B)}=\boldsymbol{s}$ . Moreover, we have $\boldsymbol{s}^{(j-1)}\preceq\boldsymbol{o}^{(j-1/2)}$ for every $j\in[B]$ . We further denote $\Delta_{e,i}\hat{f}(\boldsymbol{x})=\hat{f}(X_{1},\ldots,X_{i-1},X_{i}\cup\{e\},X_{i+1},\ldots,X_{k})-\hat{f}(X_{1},\ldots,X_{k})$ as analogous to $\Delta_{e,i}f(\boldsymbol{x})$ under surrogate function $\hat{f}$ .

We consider $f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})$ in these two cases.

Case 1: In this case, $e^{(j)}\in S_{i^{\prime}}^{(j)}$ for some $i^{\prime}\neq i^{(j)}$ . By construction, we have

f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j-1/2)})=\Delta_{o^{(j)},i^{(j)}}f\left(\boldsymbol{o}^{(j-1/2)}\right)+\Delta_{e^{(j)},i^{\prime}}f\left(\boldsymbol{o}^{(j-1/2)}\right)

and

f(\boldsymbol{o}^{(j)})-f(\boldsymbol{o}^{(j-1/2)})=\Delta_{e^{(j)},i^{(j)}}f\left(\boldsymbol{o}^{(j-1/2)}\right)+\Delta_{o^{(j)},i^{\prime}}f\left(\boldsymbol{o}^{(j-1/2)}\right).

Thus,

$\displaystyle f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})$	$\displaystyle=\left(f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j-1/2)})\right)-\left(f(\boldsymbol{o}^{(j)})-f(\boldsymbol{o}^{(j-1/2)})\right)$
	$\displaystyle=\Delta_{o^{(j)},i^{(j)}}f\left(\boldsymbol{o}^{(j-1/2)}\right)+\Delta_{e^{(j)},i^{\prime}}f(\boldsymbol{o}^{(j-1/2)})-\Delta_{e^{(j)},i^{(j)}}f(\boldsymbol{o}^{(j-1/2)})$
	$\displaystyle\qquad-\Delta_{o^{(j)},i^{\prime}}f(\boldsymbol{o}^{(j-1/2)})$	(by construction)
	$\displaystyle\leq\Delta_{o^{(j)},i^{(j)}}f(\boldsymbol{o}^{(j-1/2)})+\Delta_{e^{(j)},i^{\prime}}f(\boldsymbol{o}^{(j-1/2)})$	(monotonicity)
	$\displaystyle\leq\Delta_{o^{(j)},i^{(j)}}f(\boldsymbol{s}^{(j-1)})+\Delta_{e^{(j)},i^{\prime}}f(\boldsymbol{s}^{(j-1)})$	( $\boldsymbol{s}^{(j-1)}\preceq\boldsymbol{o}^{(j-1/2)}$ and orthant submodularity)
	$\displaystyle\leq\Delta_{o^{(j)},i^{(j)}}\hat{f}(\boldsymbol{s}^{(j-1)})+\Delta_{e^{(j)},i^{\prime}}\hat{f}(\boldsymbol{s}^{(j-1)})+4\varepsilon$	(38)
	$\displaystyle\leq 2\Delta_{e^{(j)},i^{(j)}}\hat{f}(\boldsymbol{s}^{(j-1)})+4\varepsilon$	(39)
	$\displaystyle=2(\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)}))+4\varepsilon.$	(40)

where the last inequality follows from greedy rule and that both pair $\left(o^{(j)},i^{(j)}\right)$ and $\left(e^{(j)},i^{\prime}\right)$ are available after we selected $\boldsymbol{s}^{(j)}$ , which we verify as follows: since $o^{(j)}\in S_{i^{(j)}}^{(j)}$ , $o^{(j)}$ is still available; $i^{(j)}$ is the type of the next greedy pair, thus available; $e^{(j)}$ is the item of the next greedy pair, thus available; $e^{(j)}\in S_{i^{\prime}}^{(j)}$ indicates $i^{\prime}$ is still available.

Case 2: In this case, $e^{(j)}\notin S_{i^{\prime}}^{(j)}$ for any $i^{\prime}\neq i^{(j)}$ .

$\displaystyle f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})$	$\displaystyle=\Delta_{o^{(j)},i^{(j)}}f(\boldsymbol{o}^{(j-1/2)})-\Delta_{e^{(j)},i^{(j)}}f(\boldsymbol{o}^{(j-1/2)})$	(by construction)
	$\displaystyle\leq\Delta_{o^{(j)},i^{(j)}}f(\boldsymbol{o}^{(j-1/2)})$	(monotonicity)
	$\displaystyle\leq\Delta_{o^{(j)},i^{(j)}}f(\boldsymbol{s}^{(j-1)})$	( $\boldsymbol{s}^{(j-1)}\preceq\boldsymbol{o}^{(j-1/2)}$ and orthant submodularity)
	$\displaystyle\leq\Delta_{o^{(j)},i^{(j)}}\hat{f}(\boldsymbol{s}^{(j-1)})+2\varepsilon$	(41)
	$\displaystyle\leq\Delta_{e^{(j)},i^{(j)}}\hat{f}(\boldsymbol{s}^{(j-1)})+2\varepsilon$	(greedy rule and the pair $\left(o^{(j)},i^{(j)}\right)$ is available)
	$\displaystyle\leq 2(\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)}))+4\varepsilon.$	(42)

Combining (40) and (42) we have in both cases,

\displaystyle f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})\leq 2(\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)}))+4\varepsilon.

(43)

Finally,

$\displaystyle f(\boldsymbol{o})-f(\boldsymbol{s})$	$\displaystyle=\sum_{j=1}^{B}\left(f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})\right)$	(44)
	$\displaystyle\leq 2\sum_{j=1}^{B}\left(\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)})\right)+4B\varepsilon$	(using (43))
	$\displaystyle=2(\hat{f}(\boldsymbol{s})-\hat{f}(\mathbf{0}))+4B\varepsilon$
	$\displaystyle\leq 2(f(\boldsymbol{s})-f(\mathbf{0}))+4(B+1)\varepsilon$
	$\displaystyle\leq 2f(\boldsymbol{s})+4(B+1)\varepsilon,$

Rearranging, we get $f(\boldsymbol{s})\geq\frac{1}{3}f(\boldsymbol{o})-\frac{4}{3}(B+1)\varepsilon$ .

∎

Algorithm 5

k

-submodular maximization with a matroid constraint (Sakaue, 2017)

1: Input: value oracle access for a

k

-submodular function

f:(k+1)^{V}\rightarrow\mathbb{R}_{+}

, a matroid

(V,\mathcal{F})

, given by the evaluation and independence oracle respectively.

2: Output: a vector

\boldsymbol{s}

satisfying

\operatorname{supp}(\boldsymbol{s})\in\mathcal{B}

3: Initialize

\boldsymbol{s}\leftarrow\mathbf{0}

4: for

j\in[M]

5: Construct

E(\boldsymbol{s})

using the independence oracle.

(e,i)\leftarrow\operatorname*{arg\,max}_{e\in E(\boldsymbol{s}),i\in[k]}\Delta_{e,i}f(\boldsymbol{s})

\boldsymbol{s}(e)\leftarrow i

8: end for

9: Return

\boldsymbol{s}

A.5 Proof of Proposition 6.1

Denote $\mathcal{B}$ as the set of all bases associated with a matroid $(E,\mathcal{F})$ . We first state two important lemmas that will be used in the proof. We refer to Sakaue (2017) for the detailed proofs.

Lemma A.2.

(Sakaue, 2017) The size of any maximal optimal solution for maximizing a monotone $k$ -submodular function under a matroid constraint is $M$ .

Lemma A.3.

(Sakaue, 2017) Suppose $A\in\mathcal{F}$ and $B\in\mathcal{B}$ satisfy $A\subsetneq B$ . Then, for any $e\notin A$ satisfying $A\cup\{e\}\in\mathcal{F}$ , there exists $e^{\prime}\in B\setminus A$ such that $\{B\setminus\{e^{\prime}\}\}\cup\{e\}\in\mathcal{B}$ .

Now we prove Proposition 6.1.

Proof.

Let $\left(e^{(j)},i^{(j)}\right)$ be the pair chosen greedily at the $j$ th iteration using the surrogate function $\hat{f}$ , and $\boldsymbol{s}^{(j)}$ be the solution after the $j$ th iteration. Let $s^{(0)}=\mathbf{0}$ and $s=s^{(M)}$ , the output of Algorithm 5. Define $S^{(j)}:=\operatorname{supp}(\boldsymbol{s}^{(j)})$ for each $j\in[M]$ . Let $\boldsymbol{o}$ be a maximal optimal solution and let $O^{(j)}:=\operatorname{supp}(\boldsymbol{o}^{(j)})$ for each $j\in[M]$ . Now we will construct a sequence of vectors $\boldsymbol{o}^{(0)}=\boldsymbol{o},\boldsymbol{o}^{(1)},\ldots,\boldsymbol{o}^{(M-1)},\boldsymbol{o}^{(M)}=\boldsymbol{s}$ satisfying the following:

	$\displaystyle\boldsymbol{s}^{(j)}\prec\boldsymbol{o}^{(j)}\quad\text{ if }$	$\displaystyle j=0,1,\ldots,M-1,\quad\text{ and }\quad\boldsymbol{s}^{(j)}=\boldsymbol{o}^{(j)}=\boldsymbol{s}\quad\text{ if }j=M.$		(45)
	$\displaystyle O^{(j)}\in\mathcal{B}\quad\text{ for }$	$\displaystyle j=0,1,\ldots,M.$		(46)

More specifically, we see how to obtain $\boldsymbol{o}^{(j)}$ from $\boldsymbol{o}^{(j-1)}$ satisfying (45) and (46). Note that $\boldsymbol{s}^{(0)}=\mathbf{0}$ and $\boldsymbol{o}^{(0)}=\boldsymbol{o}$ satisfy (45) and (46). We now describe how to obtain $\boldsymbol{o}^{(j)}$ from $\boldsymbol{o}^{(j-1)}$ , assuming that $\boldsymbol{o}^{(j-1)}$ satisfies

\boldsymbol{s}^{(j-1)}\prec\boldsymbol{o}^{(j-1)}\text{, and }O^{(j-1)}\in\mathcal{B}.

Since $\boldsymbol{s}^{(j-1)}\prec\boldsymbol{o}^{(j-1)}$ means $S^{(j-1)}\subsetneq O^{(j-1)}$ , and $e^{(j)}$ is chosen to satisfy $S^{(j-1)}\cup\left\{e^{(j)}\right\}\in\mathcal{F}$ , we see from Lemma A.3 that there exists $e^{\prime}\in O^{(j-1)}\setminus S^{(j-1)}$ satisfying $\{O^{(j-1)}\setminus\{e^{\prime}\}\}\cup\{e^{(j)}\}\in\mathcal{B}$ . We let $o^{(j)}=e^{\prime}$ and define $\boldsymbol{o}^{(j-1/2)}$ as the vector obtained by assigning 0 to the $o^{(j)}$ th element of $\boldsymbol{o}^{(j-1)}$ . We then define $\boldsymbol{o}^{(j)}$ as the vector obtained from $\boldsymbol{o}^{(j-1/2)}$ by assigning type $i^{(j)}$ to element $e^{(j)}$ . The vector thus constructed, $\boldsymbol{o}^{(j)}$ , satisfies

O^{(j)}=\{O^{(j-1)}\setminus\{o^{(j)}\}\}\cup\{e^{(j)}\}\in\mathcal{B}.

Furthermore, since $\boldsymbol{o}^{(j-1/2)}$ satisfies

\boldsymbol{s}^{(j-1)}\preceq\boldsymbol{o}^{(j-1/2)},

we have the following property for $\boldsymbol{o}^{(j)}$ :

\boldsymbol{s}^{(j)}\prec\boldsymbol{o}^{(j)}\quad\text{ if }j=1,\ldots,M-1,\quad\text{ and }\quad\boldsymbol{s}^{(j)}=\boldsymbol{o}^{(j)}=\boldsymbol{s}\text{ if }j=M,

where the strictness of the inclusion for $j\in[M-1]$ can be easily confirmed from $|S^{(j)}|=j<M=|O^{(j)}|$ . Thus, applying the above discussion for $j=1,\ldots,M$ iteratively, we see the obtained sequence of vectors $\boldsymbol{o}^{(0)},\boldsymbol{o}^{(1)},\ldots,\boldsymbol{o}^{(M)}$ satisfies (45) and (46). We now prove the following inequality for $j\in[M]$ :

\displaystyle\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)})\geq f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})-2\varepsilon.

(47)

From $S^{(j-1)}\subsetneq O^{(j-1)}$ and $o^{(j)}\in O^{(j-1)}\backslash S^{(j-1)}$ , we get $S^{(j-1)}\cup\left\{o^{(j)}\right\}\subseteq O^{(j-1)}\in\mathcal{B}$ for each $j\in[M]$ . Thus we obtain the following inclusion from (M2) for each $j\in[M]$ :

S^{(j-1)}\cup\{o^{(j)}\}\in\mathcal{F}.

Hence, from $o^{(j)}\notin S^{(j-1)}$ , we get $o^{(j)}\in E\left(\boldsymbol{s}^{(j-1)}\right)$ , where $E(\boldsymbol{s})$ is constructed by the independence oracle and contains the available elements given current solution $\boldsymbol{s}$ . Therefore, for the pair $\left(e^{(j)},i^{(j)}\right)$ , which is chosen greedily, we have

\displaystyle\Delta_{e^{(j)},i^{(j)}}\hat{f}(\boldsymbol{s}^{(j-1)})\geq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}\hat{f}(\boldsymbol{s}^{(j-1)}).

(48)

Furthermore, since $\boldsymbol{s}^{(j-1)}\preceq\boldsymbol{o}^{(j-1/2)}$ holds, orthant submodularity implies

\displaystyle\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}f(\boldsymbol{s}^{(j-1)})\geq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}f(\boldsymbol{o}^{(j-1/2)}).

(49)

Using (48) and (49), we have:

$\displaystyle f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})$	$\displaystyle=\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}f(\boldsymbol{o}^{(j-1/2)})-\Delta_{e^{(j)},i^{(j)}}f(\boldsymbol{o}^{(j-1/2)})$	(by construction)
	$\displaystyle\leq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}f(\boldsymbol{o}^{(j-1/2)})$	(monotonicity)
	$\displaystyle\leq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}f(\boldsymbol{s}^{(j-1)})$	(using (49))
	$\displaystyle\leq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}\hat{f}(\boldsymbol{s}^{(j-1)})+2\varepsilon$
	$\displaystyle\leq\Delta_{e^{(j)},i^{(j)}}\hat{f}(\boldsymbol{s}^{(j-1)})+2\varepsilon$	(using (48))
	$\displaystyle=\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)})+2\varepsilon,$	(50)

and we get (47). Finally,

$\displaystyle f(\boldsymbol{o})-f(\boldsymbol{s})$	$\displaystyle=\sum_{j=1}^{M}\left(f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})\right)$	(51)
	$\displaystyle\leq\sum_{j=1}^{M}\left(\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)})\right)+2M\varepsilon$	(using (47))
	$\displaystyle=(\hat{f}(\boldsymbol{s})-\hat{f}(\mathbf{0}))+2M\varepsilon$
	$\displaystyle\leq(f(\boldsymbol{s})-f(\mathbf{0}))+2(M+1)\varepsilon$
	$\displaystyle\leq f(\boldsymbol{s})+2(M+1)\varepsilon,$

Rearranging, we get $f(\boldsymbol{s})\geq\frac{1}{2}f(\boldsymbol{o})-(M+1)\varepsilon$ .

∎

A.6 Proof of Proposition 7.1

The following lemma guarantees that there exists an optimal solution with all items included (so we just need to decide on the types).

Lemma A.4.

(Sun et al., 2022) The size of any maximal optimal solution for maximizing a non-monotone $k$ -submodular function under a matroid constraint is still $M$ .

Now we prove Proposition 7.1.

Proof.

We construct the sequence $\boldsymbol{o}^{(j)}$ , $\boldsymbol{o}^{(j-1/2)}$ the same way as in the proof for Proposition 6.1. By the same construction, we have that

	$\displaystyle\boldsymbol{s}^{(j)}\prec\boldsymbol{o}^{(j)}\quad\text{ if }$	$\displaystyle j=0,1,\ldots,M-1,\quad\text{ and }\quad\boldsymbol{s}^{(j)}=\boldsymbol{o}^{(j)}=\boldsymbol{s}\quad\text{ if }j=M.$		(52)
	$\displaystyle O^{(j)}\in\mathcal{B}\quad\text{ for }$	$\displaystyle j=0,1,\ldots,M.$		(53)

and we still have (48) and (49). Since we do not have monotonicity anymore, we will exploit the pairwise submodularity property. Besides the chosen type $i^{(j)}$ for the item $e^{(j)}$ at each iteration of Algorithm 5, we consider another type $h^{(j)}\neq i^{(j)}$ . From pariwise monotonicity we have

\displaystyle\Delta_{e^{(j)},i^{(j)}}f(\boldsymbol{o}^{(j-1/2)})+\Delta_{e^{(j)},h^{(j)}}f(\boldsymbol{o}^{(j-1/2)})\geq 0.

(54)

We perform similar calculation as in the proof for Proposition 6.1:

$\displaystyle\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)})$	$\displaystyle=\Delta_{e^{(j)},i^{(j)}}\hat{f}(\boldsymbol{s}^{(j-1)})$
	$\displaystyle\geq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}\hat{f}(\boldsymbol{s}^{(j-1)})$	(using (48))
	$\displaystyle\geq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}f(\boldsymbol{s}^{(j-1)})-2\varepsilon$
	$\displaystyle\geq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}f(\boldsymbol{o}^{(j-1/2)})-2\varepsilon$	(using (49))
	$\displaystyle\geq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}f(\boldsymbol{o}^{(j-1/2)})-\Delta_{e^{(j)},i^{(j)}}f(\boldsymbol{o}^{(j-1/2)})-\Delta_{e^{(j)},h^{(j)}}f(\boldsymbol{o}^{(j-1/2)})-2\varepsilon$	(using (54))
	$\displaystyle\geq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}f(\boldsymbol{o}^{(j-1/2)})-\Delta_{e^{(j)},i^{(j)}}f(\boldsymbol{o}^{(j-1/2)})-\Delta_{e^{(j)},h^{(j)}}f(\boldsymbol{s}^{(j-1)})-2\varepsilon$	( $\boldsymbol{s}^{(j-1)}\preceq\boldsymbol{o}^{(j-1/2)}$ as shown in previous section)
	$\displaystyle\geq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}f(\boldsymbol{o}^{(j-1/2)})-\Delta_{e^{(j)},i^{(j)}}f(\boldsymbol{o}^{(j-1/2)})-\Delta_{e^{(j)},i^{(j)}}f(\boldsymbol{s}^{(j-1)})-2\varepsilon$	(greedy rule)
	$\displaystyle\geq\Delta_{o^{(j)},\boldsymbol{o}^{(j-1)}(o^{(j)})}f(\boldsymbol{o}^{(j-1/2)})-\Delta_{e^{(j)},i^{(j)}}f(\boldsymbol{o}^{(j-1/2)})-\Delta_{e^{(j)},i^{(j)}}\hat{f}(\boldsymbol{s}^{(j-1)})-4\varepsilon$
	$\displaystyle=f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j-1/2)})-f(\boldsymbol{o}^{(j)})+f(\boldsymbol{o}^{(j-1/2)})-(\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)}))-4\varepsilon$
	$\displaystyle=f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})-(\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)}))-4\varepsilon.$

Adding both sides by $\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)})$ we get

\displaystyle 2(\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)}))\geq f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})-4\varepsilon.

(55)

Finally,

$\displaystyle f(\boldsymbol{o})-f(\boldsymbol{s})$	$\displaystyle=\sum_{j=1}^{M}\left(f(\boldsymbol{o}^{(j-1)})-f(\boldsymbol{o}^{(j)})\right)$	(56)
	$\displaystyle\leq 2\sum_{j=1}^{M}\left(\hat{f}(\boldsymbol{s}^{(j)})-\hat{f}(\boldsymbol{s}^{(j-1)})\right)+4B\varepsilon$	(using (55))
	$\displaystyle=2(\hat{f}(\boldsymbol{s})-\hat{f}(\mathbf{0}))+4M\varepsilon$
	$\displaystyle\leq 2(f(\boldsymbol{s})-f(\mathbf{0}))+4(M+1)\varepsilon$
	$\displaystyle\leq 2f(\boldsymbol{s})+4(M+1)\varepsilon,$

Rearranging, we get $f(\boldsymbol{s})\geq\frac{1}{3}f(\boldsymbol{o})-\frac{4}{3}(M+1)\varepsilon$ . ∎

Appendix B Missing Offline Algorithms

We present the pseudo-codes that is missing in the main sections. Algorithm 3 is adapted from Iwata et al. (2015) for the problem of maximizing an unconstrained monotone $k$ -submobular function; Algorithm 4 is proposed in Ohsaka & Yoshida (2015) for the problem of maximizing a monotone $k$ -submobular function under IS constraints; and Algorithm 5 is proposed in Sakaue (2017) for the problem of maximizing a monotone $k$ -submobular function under a matroid constraint. Later Sun et al. (2022) showed that Algorithm 5 can also handle non-monotone objective functions. One thing to be noticed is that in Line 4, Algorithm 5 requires the value of $M$ , the rank of the matroid. However, in practice, we need not calculate the value of $M$ beforehand. Instead, we continue the iteration while $E(\boldsymbol{s})$ is nonempty, which we check in Line 5. We can confirm that this modification does not change the output as follows. As long as $|\operatorname{supp}(\boldsymbol{s})|<M$ , exactly one element is added to $\operatorname{supp}(\boldsymbol{s})$ at each iteration due to (M3), and, if $|\operatorname{supp}(\boldsymbol{s})|=M$ , the iteration stops since $\operatorname{supp}(\boldsymbol{s})$ is a maximal independent set.

Appendix C More Related Works

Offline $k$ -submodular function maximization:

$k$ -submodular functions were first introduced by Huber & Kolmogorov (2012) as a generalization of bisubmodular functions, which correspond to $k=2$ . For unconstrained $k$ -submodular maximization, Iwata et al. (2015) showed that even achieving an approximation ratio $\alpha\in(\frac{k+1}{2k},1]$ is NP-hard. If the objective is monotone, there exists a deterministic algorithm achieving $1/2$ approximation (Ward & Živný, 2014) and a randomized algorithm achieving $\frac{k}{2k-1}$ approximation guarantee (Iwata et al., 2015). When the objective is non-monotone, Ward & Živný (2014) achieved $\max\{1/3,1/(1+a)\}$ approximation guarantee using $\mathcal{O}(kn)$ number of function evaluations, where $a=\max\{1,\sqrt{(k-1)/4}\}$ . It has been further improved to $1/2$ (Iwata et al., 2015) and $\frac{k^{2}+1}{2k^{2}+1}$ (Oshima, 2021).

For monotone size constraints, two types of constraints have been considered in the literature, namely total size (TS) constraints, where the number of items selected shares a common budget, and individual size (IS) constraints, where each of the $k$ types has a budget. Ohsaka & Yoshida (2015) analyzed the greedy algorithm and obtained $1/2$ and $1/3$ approximation guarantees for total size (TS) constraints and individual size (IS) constraints, respectively. Nie et al. (2023b) proposed threshold greedy algorithms for monotone TS and IS with improved query complexity.

For maximizing a monotone $k$ -submodular function under a matroid constraint, Sakaue (2017) showed the greedy algorithm can achieve $1/2$ approximation, and Matsuoka & Ohsaka (2021) proposed an algorithm that achieves $\frac{1}{1+c}$ approximation, where $c$ is the curvature. When the objective function is non-monotone, a $1/3$ approximation algorithm is presented in Sun et al. (2022). When the curvature $c$ is known, Matsuoka & Ohsaka (2021) proposed an algorithm that achieves a $\frac{1}{1+c}$ -approximation for matroid constraints and $\frac{1}{1+2c}$ -approximation for individual size constraints.

For other constraints, Tang et al. (2022); Chen et al. (2022) proposed algorithms inspired by Khuller et al. (1999); Sviridenko (2004) for knapsack constraints. Xiao et al. (2023) considered knapsack constraints under both monotone or non-monotone cases. Yu et al. (2023) considered intersection of knapsack and matroid constraints. Pham et al. (2021) proposed an algorithm for the streaming setting under knapsack constraints. While the term “streaming” often implies an online context, it is important to note the setup is distinct from CMAB problems. In streaming, an exact value oracle is available, the action space is restricted to (subsets of) the elements revealed so far, and the evaluation is after the stream complete. Here, the term “online” specifically refers to the arrival of data has some ordering.

Appendix D Applications

In this section, we give some real world appications that motivates our problem of interests.

Application 1 (Influence maximization with $k$ topics): Let $G=(V,E)$ be a social network with an edge probability $p_{u,v}^{i}$ for each edge $(u,v)\in E$ , representing the probability of $u$ influencing $v$ on the $i$ -th topic. Given a seed set $(S_{1},\cdots,S_{k})$ , the diffusion process of the rumor about the $i$ -th topic starts by activating vertices in $S_{i}$ , and propagates independently from other topics. The goal is to maximize the total influence (number of users that is influenced by at least one topic). In the well-studied models of influence propagation — the independent cascades and the linear threshold models — the influence function is monotone and $k$ -submodular (Ohsaka & Yoshida, 2015). In an online version of the problem, the underlying graph structure is not known to the agent. For each time step, the agent selects a seed set, and observes a reward representing the number of influenced users.

Application 2 (Sensor placement): We are given $k$ types of sensors, each of which provides different measurements, and we have $B_{i}$ sensors of type $i$ for each $i\in[k]$ . The ground set $V$ is a set of possible locations for the sensors. The goal is to equip each location with at most one sensor in order to maximize the total information gained from the sensors. This problem can also be modeled as $k$ -submodular maximization with an individual size constraint (Ohsaka & Yoshida, 2015). In an online version of this problem, for each time step, we select locations to place the sensors, then we collect data to calculate entropy, and in the next time step we repeat the same process by re-allocating sensors to different locations.

Application 3 (Ad allocation): We have $k$ advertisers that are known in advance and $n$ ad impressions that arrive online one at a time. Each advertiser $i\in[k]$ has a contract of $B_{i}$ impressions. For each impression $j$ and each advertiser $i$ , there is a non-negative weight $w_{ji}$ that captures how much value advertiser $i$ accrues from being allocated impression $j$ . When impression $j$ arrives, the values $\left\{w_{ji}:i\in[k]\right\}$ are revealed, and the algorithm needs to allocate impression $j$ to at most one advertiser. Letting $X_{i}$ denote the set of impressions allocated to advertiser $i$ , the total revenue is $\sum_{i=1}^{k}\max\{\sum_{j\in S_{i}}w_{ji}:S_{i}\subseteq X_{i},\left|S_{i}\right|\leq B_{i}\}$ . This problem can be formulated as a special case of $k$ -submodular maximization with a partition matroid constraint (Ene & Nguyen, 2022).

	$\displaystyle\sum_{i=1}^{k}(\hat{y}_{i}+\hat{a}_{i})p_{i}=\sum_{i=1}^{k}\hat{y}_{i}p_{i}+\sum_{i=1}^{k}\hat{a}_{i}p_{i}$
	$\displaystyle\geq\sum_{i\leq i^{}}(\hat{a}_{i^{}}-4\varepsilon)p_{i}+\sum_{i>i^{*}}(\hat{a}_{i}-4\varepsilon)p_{i}+\sum_{i=1}^{k}\hat{a}_{i}p_{i}$		(using (6) and (3))
	$\displaystyle=\sum_{i\leq i^{}}\hat{a}_{i^{}}p_{i}+\sum_{i>i^{*}}\hat{a}_{i}p_{i}+\sum_{i=1}^{k}\hat{a}_{i}p_{i}-4\varepsilon$
	$\displaystyle=\left(\sum_{i<i^{}}\hat{a}_{i^{}}p_{i}+2\hat{a}_{i^{}}p_{i^{}}\right)$
	$\displaystyle\qquad+\left(\sum_{i>i^{}}\hat{a}_{i}p_{i}+\sum_{i\neq r,i^{}}\hat{a}_{i}p_{i}+\hat{a}_{r}p_{r}\right)-4\varepsilon.$		(7)

	$\displaystyle\sum_{i>i^{}}p_{i}+\sum_{i\neq r,i^{}}p_{i}=\sum_{i^{}+1\leq i\leq i^{+}-1}p_{i}+p_{i^{+}}+\sum_{i\neq r,i^{}}p_{i}$
	$\displaystyle=\sum_{i=i^{}+1}^{i^{+}-1}\left(\frac{1}{2}\right)^{i}+\left(\frac{1}{2}\right)^{i^{+}-1}+\sum_{i\neq r,i^{}}p_{i}$		(by construction of $p_{i}$ ’s)
	$\displaystyle=\left(\frac{1}{2}\right)^{i^{}}+\sum_{i\neq r,i^{}}p_{i}$		(geometric sum)
	$\displaystyle=p_{i^{}}+\sum_{i\neq r,i^{}}p_{i}=1-p_{r}.$		(9)

$\displaystyle\sum_{i>i^{}}p_{i}+\sum_{i\neq r,i^{}}p_{i}$	$\displaystyle=\sum_{i^{}+1\leq i\leq i^{+}-1}p_{i}+p_{i^{+}}+\sum_{i\neq r,i^{}}p_{i}$
	$\displaystyle=\sum_{i=i^{}+1}^{i^{+}-1}\left(\frac{1}{2}\right)^{i}+\left(\frac{1}{2}\right)^{i^{+}-1}+\sum_{i\neq r,i^{}}p_{i}$	(by construction of $p_{i}$ ’s)
	$\displaystyle=\left(\frac{1}{2}\right)^{i^{}}+\sum_{i\neq r,i^{}}p_{i}$	(geometric sum)
	$\displaystyle=p_{i^{}}+\sum_{i\neq r,i^{}}p_{i}=1-p_{r}.$	(28)

$\displaystyle\sum_{i\neq i^{}}(\hat{a}_{i^{}}-\hat{a}_{i})\hat{y}_{i}^{t}$	$\displaystyle\leq\sum_{i\neq i^{}}(\hat{a}_{i^{}}+2\varepsilon)\hat{y}_{i}^{t}$	(using $\hat{a}_{i}\geq-2\varepsilon$ )
	$\displaystyle\leq\sum_{i\neq i^{}}(\hat{y}_{i^{}}+6\varepsilon)\hat{y}_{i}^{t}$	(using $\hat{a}_{i}-4\varepsilon\leq\hat{y}_{i}$ )
	$\displaystyle=\sum_{i\neq i^{}}\hat{y}_{i^{}}\hat{y}_{i}^{t}+6\varepsilon\sum_{i\neq i^{*}}\hat{y}_{i}^{t}$
	$\displaystyle\leq\sum_{i\neq i^{}}\hat{y}_{i^{}}\hat{y}_{i}^{t}+6\varepsilon\sum_{i\neq i^{*}}[\hat{y}_{i}]_{+}^{t}$
	$\displaystyle=\sum_{i\neq i^{}}\hat{y}_{i^{}}\hat{y}_{i}^{t}+6\varepsilon(\beta-[\hat{y}_{i^{*}}]_{+}^{t})$	(definition of $\beta$ )
	$\displaystyle\leq\sum_{i\neq i^{}}\hat{y}_{i^{}}\hat{y}_{i}^{t}+6\varepsilon\beta$
	$\displaystyle=\frac{1}{\gamma}\left(\gamma\hat{y}_{i^{}}\sum_{i\neq i^{}}\hat{y}_{i}^{t}\right)+6\varepsilon\beta$	(34)

Stochastic kk-Submodular Bandits with Full Bandit Feedback

Abstract

1 INTRODUCTION

Remark 1.1.

kk-submodular CMAB

Submodular CMAB

2 BACKGROUND

2.1 kk-Submodular Functions:

Theorem 2.1.

Matroids

2.2 CMAB

2.3 Problem Statement

2.4 Offline-to-Online Framework

Definition 2.2 ((α,δ,N)(\alpha,\delta,N)-Robust Approximation (Nie et al., 2023a)).

Theorem 2.3.

Remark 2.4.

Remark 2.5 (Offline v.s. Online Algorithms).

3 Non-monotone Functions without Constraints

3.1 Algorithm

Proposition 3.1.

3.2 Robustness

Lemma 3.2.

Proposition 3.3.

Part of proof of Proposition 3.3.

Remark 3.4.

3.3 Regret Bound

Corollary 3.5.

4 Monotone Functions without Constraints

Proposition 4.1.

Remark 4.2.

Corollary 4.3.

5 Monotone Functions with Individual Size Constraints

Proposition 5.1.

Corollary 5.2.

6 Monotone Functions with Matroid Constraints

Proposition 6.1.

Corollary 6.2.

Corollary 6.3.

Remark 6.4.

7 Non-Monotone Functions with Matroid Constraints

Proposition 7.1.

Corollary 7.2.

8 Evaluations

9 Conclusion

References

Appendix A Missing Proofs

A.1 Proof of Lemma 3.2

Lemma A.1.

Proof.

A.2 Proof of Proposition 3.3

Proof of Proposition 3.3.

Case 1:

Sub-case 1.a:

Sub-case 1.b:

Case 2:

Sub-case 2.a:

Sub-case 2.b:

Case 3:

A.3 Proof of Proposition 4.1

Proof of Proposition 4.1.

Case 1:

Case 2:

Sub-case 2.a:

Sub-case 2.b:

A.4 Proof of Proposition 5.1

Proof.

A.5 Proof of Proposition 6.1

Lemma A.2.

Lemma A.3.

Proof.

A.6 Proof of Proposition 7.1

Lemma A.4.

Proof.

Appendix B Missing Offline Algorithms

Appendix C More Related Works

Offline kk-submodular function maximization:

Appendix D Applications

Stochastic $k$ -Submodular Bandits with Full Bandit Feedback

$k$ -submodular CMAB

2.1 $k$ -Submodular Functions:

Definition 2.2 ( $(\alpha,\delta,N)$ -Robust Approximation (Nie et al., 2023a)).

Offline $k$ -submodular function maximization: