
Using orthogonally structured positive bases for constructing positive $k$-spanning sets with cosine measure guarantees

Warren Hare Department of Mathematics, University of British Columbia, Kelowna, British Columbia, Canada. Hare's research is partially funded by the Natural Sciences and Engineering Research Council of Canada, Discovery Grant #2018-03865. ORCID 0000-0002-4240-3903 ([email protected]).    Gabriel Jarry-Bolduc Mathematics and Statistics Department, Saint Francis Xavier University, Antigonish, Nova Scotia, Canada. Jarry-Bolduc's research is partially funded through the Natural Sciences and Engineering Research Council of Canada, Discovery Grant #2018-03865. ORCID 0000-0002-1827-8508 ([email protected]).    Sébastien Kerleau LAMSADE, CNRS, Université Paris Dauphine-PSL, Place du Maréchal de Lattre de Tassigny, 75016 Paris, France ([email protected]).    Clément W. Royer LAMSADE, CNRS, Université Paris Dauphine-PSL, Place du Maréchal de Lattre de Tassigny, 75016 Paris, France. Royer's research is partially funded by Agence Nationale de la Recherche through program ANR-19-P3IA-0001 (PRAIRIE 3IA Institute). ORCID 0000-0003-2452-2172 ([email protected]). Research for this paper is also supported by France-Canada Research Funds 2022.
Abstract

Positive spanning sets span a given vector space by nonnegative linear combinations of their elements. These have attracted significant attention in recent years, owing to their extensive use in derivative-free optimization. In this setting, the quality of a positive spanning set is assessed through its cosine measure, a geometric quantity that expresses how well such a set covers the space of interest. In this paper, we investigate the construction of positive $k$-spanning sets with geometrical guarantees. Our results build on recently identified positive spanning sets, called orthogonally structured positive bases. We first describe how to identify such sets and compute their cosine measures efficiently. We then focus our study on positive $k$-spanning sets, for which we provide a complete description, as well as a new notion of cosine measure that accounts for the resilient nature of such sets. By combining our results, we are able to use orthogonally structured positive bases to create positive $k$-spanning sets with guarantees on the value of their cosine measures.

Keywords

Positive spanning sets; Positive $k$-spanning sets; Positive bases; Positive $k$-bases; Cosine measure; $k$-cosine measure; Derivative-free optimization.

2020 Mathematics Subject Classification

15A03; 15A21; 15B30; 15B99; 90C56.

1 Introduction

Positive spanning sets, or PSSs, are sets of vectors that span a space of interest through nonnegative linear combinations. These sets were first investigated by Davis [4], with a particular emphasis on inclusion-wise minimal PSSs, also called positive bases [4, 15]. Such a restriction guarantees bounds on the minimal and maximal size of a positive basis [1, 4]. PSSs of this form are now well understood in the literature [18]. Any positive basis is amenable to a decomposition into minimal positive bases over subspaces [21], a result that recently led to the characterization of nicely structured positive bases [7], hereafter called orthogonally structured positive bases, or OSPBs.

Positive spanning sets and positive bases are now a standard tool for developing derivative-free optimization algorithms [5, 3]. The idea is to use directions from a PSS as a replacement for derivative information. In that setting, it becomes critical to quantify how well directions from a PSS cover the space of interest, which is typically assessed through the cosine measure of this PSS [11, 10]. In general, no closed-form expression is available for this cosine measure, yet various computing techniques have been proposed [6, 19]. In particular, a deterministic algorithm to compute the cosine measure of any PSS was recently proposed by Hare and Jarry-Bolduc [6]. The cosine measure of certain positive bases is known, along with upper bounds for positive bases of minimal and maximal sizes [16, 18]. The intermediate size case is less understood, even though recent progress was made in the case of orthogonally structured positive bases [7].

Meanwhile, another category of positive spanning sets introduced in the 1970s has received little attention in the literature. Those sets, called positive $k$-spanning sets (P$k$SSs), can be viewed as resilient PSSs, since at least $k$ of their elements must be removed in order to make them lose their positively spanning property [12]. Unlike standard PSSs, only partial results are known regarding inclusion-wise minimal P$k$SSs. These results depart from those for PSSs as they rely on polytope theory [13, 22]. Moreover, the construction of P$k$SSs based on PSSs has not been fully explored. To the best of our knowledge, the cosine measure of P$k$SSs has not been investigated, which partly prevents those sets from being used in derivative-free algorithms.

In this paper, we investigate positive $k$-spanning sets through the lens of orthogonally structured positive bases and cosine measure. To this end, we refine previous results on OSPBs so as to obtain an efficient cosine measure calculation technique. We then explain how the notion of cosine measure can be generalized to account for the positive $k$-spanning property, thereby introducing a quantity called the $k$-cosine measure. To the best of our knowledge, this definition is new in both the derivative-free optimization and the positive spanning set literature. By combining those elements, we are able to build positive $k$-spanning sets from OSPBs as well as to provide a bound on their $k$-cosine measures. Our results pave the way for using positive $k$-spanning sets in derivative-free algorithms.

The rest of this paper is organized as follows. Section 2 summarizes important properties of positive spanning sets, including their characterization through the cosine measure, with a particular focus on the subclass of orthogonally structured positive bases (OSPBs). Section 3 describes an efficient way to compute the cosine measure of an OSPB by leveraging its decomposition. Section 4 formalizes the main properties associated with positive $k$-spanning sets, introduces the $k$-cosine measure to help study these sets, and uses OSPBs to design P$k$SSs with guaranteed $k$-cosine measure. Section 5 summarizes our findings and provides several perspectives on our work.

Notations

Throughout this paper, we will work in the Euclidean space $\mathbb{R}^n$ with $n\geq 2$, or a linear subspace thereof, denoted by $\mathbb{L}\subset\mathbb{R}^n$. More generally, blackboard bold letters will be used to designate infinite sets, such as linear (sub)spaces and half-spaces. The Minkowski sum $\{\mathbf{a}+\mathbf{b},\ (\mathbf{a},\mathbf{b})\in\mathbb{A}\times\mathbb{B}\}$ of two spaces $\mathbb{A}$ and $\mathbb{B}$ will be denoted by $\mathbb{A}+\mathbb{B}$, or by $\mathbb{A}\perp\mathbb{B}$ if the two spaces are orthogonal to one another. The orthogonal complement of a given subspace $\mathbb{A}$ will be denoted accordingly by $\mathbb{A}^{\perp}$. In order to allow for repetition among their elements, positive spanning sets will be seen as families of vectors. We will use calligraphic letters such as $\mathcal{D}$ to denote finite families of vectors. Given a family $\mathcal{D}\subset\mathbb{R}^n$, the linear span of this family (i.e., the set of linear combinations of its elements) will be denoted by $\operatorname{span}(\mathcal{D})$. Bold lowercase (resp. uppercase) letters will be used to designate vectors (resp. matrices). The notations $\mathbf{0}_n$ and $\mathbf{1}_n$ will respectively be used to designate the null vector and the all-ones vector in $\mathbb{R}^n$, while $\mathcal{I}_n:=\{\mathbf{e}_1,\dots,\mathbf{e}_n\}$ will denote the family formed by the coordinate basis vectors in $\mathbb{R}^n$. Given a family $\mathcal{D}_a$ of vectors in $\mathbb{R}^n$, we will use $\mathbf{D}_a$ to denote a matrix with $n$ rows whose columns correspond to the elements of $\mathcal{D}_a$, in no particular order unless otherwise stated. We will use $\mathcal{D}_a\setminus\{\mathbf{d}\}$ to denote a family obtained from $\mathcal{D}_a$ by removing exactly one instance of the vector $\mathbf{d}$. Similarly, the notation $\mathcal{D}_a\cup\{\mathbf{d}\}$ will designate the family obtained from $\mathcal{D}_a$ by adding one copy of the vector $\mathbf{d}$. For instance, $\{\mathbf{d},\mathbf{d},\mathbf{d}'\}\setminus\{\mathbf{d}\}=\{\mathbf{d},\mathbf{d}'\}$ and $\{\mathbf{d},\mathbf{d}'\}\cup\{\mathbf{d}\}=\{\mathbf{d},\mathbf{d},\mathbf{d}'\}$. Finally, the notation $[\![1,m]\!]$ will refer to the set $\{1,\dots,m\}$.

2 Background on positive spanning sets

In this section, we introduce the main concepts and results on positive spanning sets that will be used throughout the paper. Section 2.1 defines positive spanning sets and positive bases. Section 2.2 introduces the concept of orthogonally structured positive bases, which were first studied under a different name [7]. Section 2.3 provides the definition of the cosine measure of a positive spanning set, along with its value for several commonly used positive bases.

2.1 Positive spanning sets and positive bases

In this section, we recall the definitions of positive spanning sets and positive bases, based on the seminal paper of Davis [4].

Definition 2.1 (Positive span and positive spanning set)

Let $\mathbb{L}$ be a linear subspace of $\mathbb{R}^n$ and $m\geq 1$. The positive span of a family of vectors $\mathcal{D}=\{\mathbf{d}_1,\mathbf{d}_2,\dots,\mathbf{d}_m\}$ in $\mathbb{L}$, denoted $\operatorname{pspan}(\mathcal{D})$, is the set

$$\operatorname{pspan}(\mathcal{D}):=\{\mathbf{u}\in\mathbb{L}:\mathbf{u}=\alpha_1\mathbf{d}_1+\cdots+\alpha_m\mathbf{d}_m\ |\ \forall i\in[\![1,m]\!],\ \alpha_i\geq 0\}.$$

A positive spanning set (PSS) of $\mathbb{L}$ is a family of vectors $\mathcal{D}$ such that $\operatorname{pspan}(\mathcal{D})=\mathbb{L}$.

When $\mathbb{L}$ is not specified, a positive spanning set is understood as positively spanning $\mathbb{R}^n$. Throughout the paper, we only consider positive spanning sets formed of nonzero vectors. Note, however, that we do not force all elements of a positive spanning set to be distinct. The next two lemmas are well known and describe basic properties of positive spanning sets.

Lemma 2.1

[18, Theorem 2.5] Let $\mathcal{D}$ be a finite set of vectors in a subspace $\mathbb{L}$ of $\mathbb{R}^n$. The following statements are equivalent:

  1. (i)

    $\mathcal{D}$ is a PSS of $\mathbb{L}$.

  2. (ii)

    For every nonzero vector $\mathbf{u}\in\mathbb{L}$, there exists an element $\mathbf{d}\in\mathcal{D}$ such that $\mathbf{u}^{\top}\mathbf{d}>0$.

  3. (iii)

    $\operatorname{span}(\mathcal{D})=\mathbb{L}$ and $\mathbf{0}_n$ can be written as a positive linear combination of the elements of $\mathcal{D}$.

Lemma 2.2

[4, Theorem 3.7] If $\mathcal{D}$ is a PSS of $\mathbb{L}$, then for any $\mathbf{d}\in\mathcal{D}$ the set $\mathcal{D}\backslash\{\mathbf{d}\}$ linearly spans $\mathbb{L}$.

Note that Lemma 2.2 implies that a PSS must contain at least $\dim(\mathbb{L})+1$ vectors. There is no upper bound on the number of elements that a PSS may contain, but such a restriction holds for a subclass of positive spanning sets called positive bases.
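Statement (iii) of Lemma 2.1 also suggests a simple computational test for the positive spanning property. The sketch below is our own illustration (not part of the original development): it checks $\operatorname{span}(\mathcal{D})=\mathbb{R}^n$ via a rank computation, and the existence of a strictly positive combination of the elements equal to $\mathbf{0}_n$ via a linear program, using the fact that, by rescaling, such a combination exists if and only if $\mathbf{D}\boldsymbol{\alpha}=\mathbf{0}_n$ is feasible with $\alpha_i\geq 1$ for all $i$.

```python
# Sketch: testing the PSS property via Lemma 2.1(iii) with NumPy/SciPy.
# D is an n x m matrix whose columns form the family of vectors.
import numpy as np
from scipy.optimize import linprog

def is_pss(D, tol=1e-10):
    n, m = D.shape
    # span(D) must be all of R^n
    if np.linalg.matrix_rank(D, tol=tol) < n:
        return False
    # Feasibility LP: D @ alpha = 0 with alpha_i >= 1 (zero objective)
    res = linprog(c=np.zeros(m), A_eq=D, b_eq=np.zeros(n),
                  bounds=[(1, None)] * m)
    return res.success

# Example: the minimal positive basis I_2 u {-(1,1)} of R^2
D = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0]])
print(is_pss(D))          # True
print(is_pss(D[:, :2]))   # False: {e1, e2} alone is not a PSS
```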

Definition 2.2 (Positive basis)

Let $\mathbb{L}$ be a linear subspace of $\mathbb{R}^n$ of dimension $\ell\geq 1$. A positive basis of $\mathbb{L}$ of size $\ell+s$, denoted $\mathcal{D}_{\mathbb{L},s}$, is a PSS of $\mathbb{L}$ with $\ell+s$ elements satisfying:

$$\forall\mathbf{d}\in\mathcal{D}_{\mathbb{L},s},\quad\operatorname{pspan}(\mathcal{D}_{\mathbb{L},s}\backslash\{\mathbf{d}\})\neq\mathbb{L}.$$

When $\mathbb{L}=\mathbb{R}^n$, such a set will be denoted by $\mathcal{D}_{n,s}$.

In other words, positive bases of $\mathbb{L}$ are inclusion-wise minimal positive spanning sets of $\mathbb{L}$ [3]. Positive bases can also be defined thanks to the notion of positive independence.

Definition 2.3 (Positive independence)

Let $\mathbb{L}$ be a linear subspace of $\mathbb{R}^n$ of dimension $\ell\geq 1$. A family of vectors $\mathcal{D}$ in $\mathbb{L}$ is positively independent if, for any $\mathbf{d}\in\mathcal{D}$, there exists a vector $\mathbf{u}\in\mathbb{L}$ such that $\mathbf{u}^{\top}\mathbf{d}>0$ and $\mathbf{u}^{\top}\mathbf{v}\leq 0$ for any $\mathbf{v}\in\mathcal{D}\setminus\{\mathbf{d}\}$.

A positive basis of $\mathbb{L}$ is thus a positively independent PSS of $\mathbb{L}$. In the case $\mathbb{L}=\mathbb{R}^n$, we simply say that $\mathcal{D}_{n,s}$ is a positive basis (of size $n+s$).

One can show that the size of a positive basis of a subspace $\mathbb{L}$ is at least $\dim(\mathbb{L})+1$ and at most $2\dim(\mathbb{L})$ [2, 4].

Definition 2.4

Let $\mathbb{L}$ be a linear subspace of $\mathbb{R}^n$ of dimension $\ell\geq 1$ and $1\leq s\leq\ell$. A positive basis $\mathcal{D}_{\mathbb{L},s}$ of $\mathbb{L}$ is called minimal if $s=1$, in which case we denote it by $\mathcal{D}_{\mathbb{L}}=\mathcal{D}_{\mathbb{L},1}$. The positive basis is called maximal if $s=\ell$. If $1<s<\ell$, we say that the positive basis has intermediate size.

Maximal and minimal positive bases have a well-understood structure [2, 17]. The structure of an arbitrary positive basis can however be quite complicated to analyze. In the next section, we describe a decomposition formula for positive bases that identifies a favorable structure.

2.2 Orthogonally structured positive bases

A complete characterization of positive bases can be provided through a subspace decomposition [21]. Such a decomposition is based upon the concept of critical vectors defined below.

Definition 2.5 (Critical vectors)

Let $\mathbb{L}$ be a subspace of $\mathbb{R}^n$ and $\mathcal{D}_{\mathbb{L},s}$ be a positive basis of $\mathbb{L}$. A vector $\mathbf{c}\in\mathbb{L}$ is called a critical vector of $\mathcal{D}_{\mathbb{L},s}$ if it cannot replace an element of $\mathcal{D}_{\mathbb{L},s}$ to form a positive spanning set, i.e.

$$\forall\mathbf{d}\in\mathcal{D}_{\mathbb{L},s},\quad\operatorname{pspan}\left((\mathcal{D}_{\mathbb{L},s}\setminus\{\mathbf{d}\})\cup\{\mathbf{c}\}\right)\neq\mathbb{L}.$$

Note that the zero vector is a critical vector for every positive basis of $\mathbb{L}$, and that it is the only critical vector for a maximal positive basis [21]. Moreover, if $\mathcal{D}_{\mathbb{L}}=\{\mathbf{d}_1,\dots,\mathbf{d}_{\ell+1}\}$ is a minimal positive basis and $\dim(\mathbb{L})\geq 2$, the set of critical vectors (known as the complete critical set) is given by

$$-\bigcup_{i\neq j}\operatorname{pspan}\left(\mathcal{D}_{\mathbb{L}}\setminus\{\mathbf{d}_i,\mathbf{d}_j\}\right).$$

Using critical vectors, one may decompose any positive basis as a union of minimal positive bases. Theorem 2.1 gives the result for a positive basis in $\mathbb{R}^n$, by adapting the original decomposition result due to Romanowicz [21, Theorem 1] (see also [7, Lemma 25]).

Theorem 2.1 (Structure of a positive basis)

Suppose that $n\geq 2$, $s\geq 1$, and consider a positive basis $\mathcal{D}_{n,s}$ of $\mathbb{R}^n$. Then, either $s=1$ or there exist subspaces $\mathbb{L}_1,\dots,\mathbb{L}_s$ of $\mathbb{R}^n$ such that $\mathbb{R}^n=\mathbb{L}_1+\mathbb{L}_2+\cdots+\mathbb{L}_s$ and $\mathbb{L}_i\cap\mathbb{L}_j=\{\mathbf{0}\}$ for any $i\neq j$, and there exist $s$ associated minimal positive bases $\mathcal{D}_{\mathbb{L}_1},\dots,\mathcal{D}_{\mathbb{L}_s}$ such that

$$\mathcal{D}_{n,s}\;=\;\mathcal{D}_{\mathbb{L}_1}\cup(\mathcal{D}_{\mathbb{L}_2}+\mathbf{c}_1)\cup\cdots\cup(\mathcal{D}_{\mathbb{L}_s}+\mathbf{c}_{s-1}), \qquad (1)$$

where for any $j=1,\dots,s-1$, we let $\mathcal{D}_{\mathbb{L}_{j+1}}+\mathbf{c}_j:=\left\{\mathbf{d}+\mathbf{c}_j\ \middle|\ \mathbf{d}\in\mathcal{D}_{\mathbb{L}_{j+1}}\right\}$, and the vector $\mathbf{c}_j\in\mathbb{L}_1+\cdots+\mathbb{L}_j$ is a critical vector for $\mathcal{D}_{\mathbb{L}_1}\cup\bigcup_{i=2}^{j}(\mathcal{D}_{\mathbb{L}_i}+\mathbf{c}_{i-1})$.

The result of Theorem 2.1 is actually an equivalence, in that any set admitting the decomposition (1) is a non-minimal positive basis of $\mathbb{R}^n$ [21, Theorem 1]. The example below provides an illustration for both Definition 2.5 and Theorem 2.1. One can easily check that the stated vector is indeed critical, and that the proposed decomposition matches (1).

Example 2.1

Consider the following positive basis of $\mathbb{R}^4$:

$$\mathcal{D}_{4,2}=\left\{\begin{bmatrix}1\\0\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\\0\end{bmatrix},\begin{bmatrix}-1\\-1\\2\\2\end{bmatrix},\begin{bmatrix}1\\1\\-4\\-4\end{bmatrix},\begin{bmatrix}0\\0\\1\\0\end{bmatrix},\begin{bmatrix}0\\0\\0\\1\end{bmatrix}\right\}.$$

The first four vectors of this set form a minimal positive basis for $\{[x\ y\ z\ z]^{\top} : (x,y,z)\in\mathbb{R}^3\}$ that admits $[0\ 0\ 0.5\ 0.5]^{\top}$ as a critical vector. Therefore, a decomposition of the form (1) for $\mathcal{D}_{4,2}$ is:

$$\mathcal{D}_{4,2}=\left\{\begin{bmatrix}1\\0\\0\\0\end{bmatrix},\begin{bmatrix}0\\1\\0\\0\end{bmatrix},\begin{bmatrix}-1\\-1\\2\\2\end{bmatrix},\begin{bmatrix}1\\1\\-4\\-4\end{bmatrix}\right\}\bigcup\left(\left\{\begin{bmatrix}0\\0\\0.5\\-0.5\end{bmatrix},\begin{bmatrix}0\\0\\-0.5\\0.5\end{bmatrix}\right\}+\left\{\begin{bmatrix}0\\0\\0.5\\0.5\end{bmatrix}\right\}\right).$$

In general, however, the decomposition (1) is hard to compute, as it requires determining the critical vectors and the subspaces $\mathbb{L}_i$, which need not be orthogonal to one another. These considerations have led researchers to consider a subclass of positive bases with a nicer decomposition [7], leading to Definition 2.6 below.

Definition 2.6 (Orthogonally structured positive bases)

Suppose that $n\geq 2$, and consider a positive basis $\mathcal{D}_{n,s}$ of $\mathbb{R}^n$. If there exist subspaces $\mathbb{L}_1,\dots,\mathbb{L}_s$ of $\mathbb{R}^n$ such that $\mathbb{R}^n=\mathbb{L}_1\perp\mathbb{L}_2\perp\cdots\perp\mathbb{L}_s$ and associated minimal positive bases $\mathcal{D}_{\mathbb{L}_1},\dots,\mathcal{D}_{\mathbb{L}_s}$ such that

$$\mathcal{D}_{n,s}\;=\;\mathcal{D}_{\mathbb{L}_1}\cup\mathcal{D}_{\mathbb{L}_2}\cup\cdots\cup\mathcal{D}_{\mathbb{L}_s}, \qquad (2)$$

then $\mathcal{D}_{n,s}$ is called an orthogonally structured positive basis, or OSPB.

By construction, note that for any $1\leq i<j\leq s$, any pair of elements $(\mathbf{d},\mathbf{d}')\in\mathcal{D}_{\mathbb{L}_i}\times\mathcal{D}_{\mathbb{L}_j}$ satisfies $\mathbf{d}^{\top}\mathbf{d}'=0$.

The class of orthogonally structured positive bases includes common positive bases, such as minimal ones and certain maximal positive bases such as the one formed by the coordinate vectors and their negatives in $\mathbb{R}^n$. More OSPBs can be obtained via numerical procedures, even for intermediate sizes [7]. As will be shown in Section 3, it is also possible to compute their cosine measure in an efficient manner.
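As a small illustration (our own, not taken from [7]), the following intermediate-size positive basis of $\mathbb{R}^3$ is orthogonally structured:

$$\mathcal{D}_{3,2}=\underbrace{\{\mathbf{e}_1,\,-\mathbf{e}_1\}}_{\mathcal{D}_{\mathbb{L}_1}}\cup\underbrace{\{\mathbf{e}_2,\,\mathbf{e}_3,\,-\mathbf{e}_2-\mathbf{e}_3\}}_{\mathcal{D}_{\mathbb{L}_2}},\qquad\mathbb{L}_1=\operatorname{span}(\mathbf{e}_1),\quad\mathbb{L}_2=\operatorname{span}(\mathbf{e}_2,\mathbf{e}_3).$$

Indeed, $\mathbb{R}^3=\mathbb{L}_1\perp\mathbb{L}_2$ and each $\mathcal{D}_{\mathbb{L}_i}$ is a minimal positive basis of $\mathbb{L}_i$, so (2) holds with $s=2$.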

2.3 The cosine measure

To end this background section, we define the cosine measure, a metric associated with a given family of vectors.

Definition 2.7 (Cosine measure and cosine vector set)

Let $\mathcal{D}$ be a nonempty family of nonzero vectors in $\mathbb{R}^n$. The cosine measure of $\mathcal{D}$ is given by

$$\operatorname{cm}(\mathcal{D})=\min_{\substack{\|\mathbf{u}\|=1\\ \mathbf{u}\in\mathbb{R}^n}}\max_{\mathbf{d}\in\mathcal{D}}\frac{\mathbf{u}^{\top}\mathbf{d}}{\|\mathbf{d}\|}.$$

The cosine vector set associated with $\mathcal{D}$ is given by

$$c\mathcal{V}(\mathcal{D})=\underset{\substack{\|\mathbf{u}\|=1\\ \mathbf{u}\in\mathbb{R}^n}}{\operatorname{arg\,min}}\ \max_{\mathbf{d}\in\mathcal{D}}\frac{\mathbf{u}^{\top}\mathbf{d}}{\|\mathbf{d}\|}.$$

The cosine measure and cosine vector set provide insight into the geometry of the elements of $\mathcal{D}$ in the space. Such concepts are of particular interest in the case of positive spanning sets, as shown by the result of Proposition 2.1. Its proof can be found in key references in derivative-free optimization [2, 3]; note also that our analysis in Section 4 provides a more general proof in the context of positive $k$-spanning sets.

Proposition 2.1

Let $\mathcal{D}$ be a nonempty, finite family of vectors in $\mathbb{R}^n$. Then $\mathcal{D}$ is a positive spanning set in $\mathbb{R}^n$ if and only if $\operatorname{cm}(\mathcal{D})>0$.

Proposition 2.1 shows that the positive spanning property can be checked by computing the cosine measure. The value of the cosine measure further characterizes how well that set covers the space, which is relevant information in derivative-free algorithms [10]. Both observations motivate the search for efficient techniques to compute the cosine measure.

We end this section by providing several examples of cosine measures for orthogonally structured positive bases, which form the core interest of this paper. We focus on building those positive bases from the coordinate vectors. A classical choice for a maximal positive basis consists in selecting the coordinate vectors and their negatives. In that case, it is known [16] that

$$\operatorname{cm}\left(\mathcal{I}_n\cup-\mathcal{I}_n\right)=\operatorname{cm}\left(\left\{\mathbf{e}_1,\dots,\mathbf{e}_n,-\mathbf{e}_1,\dots,-\mathbf{e}_n\right\}\right)=\frac{1}{\sqrt{n}}.$$

One can also consider a minimal positive basis formed by the coordinate vectors and the negative of their sum. Then, we have

$$\operatorname{cm}\left(\mathcal{I}_n\cup\left\{-\sum_{i=1}^{n}\mathbf{e}_i\right\}\right)=\operatorname{cm}\left(\left\{\mathbf{e}_1,\dots,\mathbf{e}_n,-\sum_{i=1}^{n}\mathbf{e}_i\right\}\right)=\frac{1}{\sqrt{n^2+2(n-1)\sqrt{n}}}.$$

To the best of our knowledge, the latter formula was only recently stated in [9, 20], and no formal proof is available in the literature. In the next section, we will develop an efficient cosine measure calculation technique for orthogonally structured positive bases that provides a proof of this result as a byproduct.

3 Computing OSPBs and their cosine measure

In this section, we describe how orthogonally structured positive bases can be identified using the decomposition introduced in the previous section. We also leverage this decomposition to design algorithms for detecting the OSPB structure and computing the cosine measure of a given OSPB.

3.1 Structure and detection of OSPBs

The favorable structure of an OSPB can be revealed through the properties of its matrix representations, and in particular the Gram matrices associated with an OSPB, in the sense of Definition 3.1 below.

Definition 3.1 (Gram matrix)

Let $\mathcal{S}$ be a finite family of $m$ vectors in $\mathbb{R}^n$ and $\mathbf{S}\in\mathbb{R}^{n\times m}$ a matrix representation of this family. The Gram matrix of $\mathcal{S}$ associated with $\mathbf{S}$ is the matrix $\mathbf{G}(\mathbf{S})=\mathbf{S}^{\top}\mathbf{S}$.

Given a Gram matrix $\mathbf{G}(\mathbf{S})$ of $\mathcal{S}$, any Gram matrix of $\mathcal{S}$ has the form $\mathbf{P}^{\top}\mathbf{G}(\mathbf{S})\mathbf{P}$, where $\mathbf{P}$ is a permutation matrix. We will show that when $\mathcal{S}$ is an OSPB, there exists a Gram matrix with a block-diagonal structure that reveals the decomposition of the basis, and thus its OSPB nature.

Theorem 3.1

Let $n\geq 2$, $s\geq 2$ and $\mathcal{D}_{n,s}\subset\mathbb{R}^n$ be a positive basis. Then, $\mathcal{D}_{n,s}$ is orthogonally structured if and only if one of its Gram matrices $\mathbf{G}(\mathcal{D}_{n,s})$ can be written as a block diagonal matrix with $s$ diagonal blocks.

Proof. Suppose first that $\mathcal{D}_{n,s}$ is an OSPB associated to the decomposition

$$\mathcal{D}_{n,s}=\mathcal{D}_{\mathbb{L}_1}\cup\mathcal{D}_{\mathbb{L}_2}\cup\cdots\cup\mathcal{D}_{\mathbb{L}_s}.$$

Let $\mathbf{D}_{n,s}$ be a matrix representation of $\mathcal{D}_{n,s}$ such that $\mathbf{D}_{n,s}=\begin{bmatrix}\mathbf{D}_{\mathbb{L}_1}&\cdots&\mathbf{D}_{\mathbb{L}_s}\end{bmatrix}$, where $\mathbf{D}_{\mathbb{L}_i}$ is a matrix representation of $\mathcal{D}_{\mathbb{L}_i}$ for every $i\in[\![1,s]\!]$. By orthogonality of those matrices, the Gram matrix $\mathbf{G}(\mathbf{D}_{n,s})=\mathbf{D}_{n,s}^{\top}\mathbf{D}_{n,s}$ is then block diagonal with $s$ blocks corresponding to $\mathbf{G}(\mathbf{D}_{\mathbb{L}_1}),\dots,\mathbf{G}(\mathbf{D}_{\mathbb{L}_s})$, and thus the desired result holds.

Conversely, suppose that there exists a matrix representation $\mathbf{D}_{n,s}$ of $\mathcal{D}_{n,s}$ such that $\mathbf{G}(\mathbf{D}_{n,s})$ is block diagonal with $s$ blocks. Then, by considering a partitioning with respect to those diagonal blocks, one can decompose $\mathbf{D}_{n,s}$ into $\begin{bmatrix}\mathbf{S}_1&\cdots&\mathbf{S}_s\end{bmatrix}$ so that $\mathbf{G}(\mathbf{D}_{n,s})=\operatorname{diag}\left(\mathbf{G}(\mathbf{S}_1),\dots,\mathbf{G}(\mathbf{S}_s)\right)$. By definition of the Gram matrix, this structure implies that the columns of $\mathbf{S}_i$ and $\mathbf{S}_j$ are orthogonal for any $i\neq j$, thus we only need to show that each $\mathbf{S}_i$ is a minimal positive basis for its linear span. To this end, we use the fact that $\mathcal{D}_{n,s}$ is positively spanning. By Lemma 2.1(iii), there exists a vector $\mathbf{u}\in\mathbb{R}^{n+s}$ with positive coefficients such that $\mathbf{D}_{n,s}\mathbf{u}=\mathbf{0}_n$. By decomposing the vector $\mathbf{u}$ into $s$ vectors $\mathbf{u}_1,\dots,\mathbf{u}_s$ according to the decomposition $\begin{bmatrix}\mathbf{S}_1&\cdots&\mathbf{S}_s\end{bmatrix}$, we obtain that

$$\mathbf{D}_{n,s}\mathbf{u}=\sum_{i=1}^{s}\mathbf{S}_i\mathbf{u}_i=\mathbf{0}_n. \qquad (3)$$

By orthogonality of the columns of the $\mathbf{S}_i$ matrices, the property (3) is equivalent to

$$\mathbf{S}_i\mathbf{u}_i=\mathbf{0}_n\quad\forall i\in[\![1,s]\!].$$

Using again Lemma 2.1(iii), this latter property is equivalent to $\mathcal{S}_i$ being a PSS for its linear span. Let $n_i$ be the dimension of this span, and $m_i$ the number of elements in $\mathcal{S}_i$. Since the $\mathbf{S}_i$ are orthogonal and the columns of $\mathbf{D}_{n,s}$ span $\mathbb{R}^n$, we must have $\sum_{i=1}^{s}n_i=n$. In addition, we also have $\sum_{i=1}^{s}m_i=n+s=\sum_{i=1}^{s}n_i+s$. Since $m_i\geq n_i+1$ for all $i\in[\![1,s]\!]$ (a PSS of an $n_i$-dimensional subspace has at least $n_i+1$ elements), we conclude that $m_i=n_i+1$ for every $i$, and thus every $\mathcal{S}_i$ is a minimal positive basis. $\Box$

Note that although Theorem 3.1 is stated for $s\geq 2$, the characterization trivially holds for minimal positive bases ($s=1$). This theorem thus provides a characterization of the OSPB property through Gram matrices. One can then use this result to determine whether a positive basis has an orthogonal structure and, in that event, to recover the associated decomposition.

Corollary 3.1

Let $n\geq 2$, $s\geq 1$ and $\mathcal{D}_{n,s}$ be a positive basis in $\mathbb{R}^n$. Let $\mathbf{D}_{n,s}$ be a matrix representation of $\mathcal{D}_{n,s}$. If there exists a permutation matrix $\mathbf{P}\in\mathbb{R}^{(n+s)\times(n+s)}$ such that the Gram matrix $\mathbf{G}(\mathbf{D}_{n,s}\mathbf{P})$ is block diagonal with $s$ blocks, then $\mathcal{D}_{n,s}$ is an OSPB, and its decomposition (2) is given by the columns of $\mathbf{D}_{n,s}\mathbf{P}$ in that order.

Corollary 3.1 provides a principled way of detecting the OSPB structure, by checking all possible permutations of the elements in the basis. Although this is not the focus of this work, we remark that the search can be performed efficiently by considering a graph whose Laplacian matrix $\mathbf{L}$ has nonzero entries in exactly the same positions as $\mathbf{G}(\mathbf{D}_{n,s})$. Indeed, finding a suitable permutation for the matrix $\mathbf{L}$ reduces to finding the connected components of this graph [14], which can be done in polynomial time [8].
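As an illustration of this remark, the following sketch (our own, not taken from [14] or [8]) recovers the candidate blocks of an OSPB by computing the connected components of the graph whose adjacency pattern matches the nonzero entries of the Gram matrix; in view of Theorem 3.1, a positive basis is an OSPB with $s$ blocks precisely when this graph has $s$ connected components.

```python
# Sketch: detecting the OSPB structure of a positive basis D (n x (n+s))
# through the connected components of its "Gram graph".
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def ospb_blocks(D, tol=1e-12):
    G = D.T @ D                          # a Gram matrix of the family
    A = (np.abs(G) > tol).astype(int)    # adjacency pattern of the Gram graph
    s, labels = connected_components(csr_matrix(A), directed=False)
    # Group the columns by component: each group is a candidate block D_{L_i}
    return [D[:, labels == i] for i in range(s)]

# Example: the maximal OSPB I_2 u -I_2 of R^2
D = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
print([B.shape[1] for B in ospb_blocks(D)])   # [2, 2]: two minimal blocks
```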

3.2 Cosine measure of an orthogonally structured positive basis

As seen in the previous section, orthogonally structured positive bases admit a decomposition into minimal positive bases that are orthogonal to one another. In this section, we investigate how this particular structure allows for efficient computation of the cosine measure.

Our approach builds on the algorithm introduced by Hare and Jarry-Bolduc [6], which computes the cosine measure of any positive spanning set in finite time (though the computation time may be exponential in the problem dimension). This procedure consists of a loop over all possible linear bases contained in the positive spanning set. More formally, given a positive spanning set $\mathcal{D}$ in $\mathbb{R}^n$ and a linear basis $\mathcal{B}\subset\mathcal{D}$, we let $\mathbf{D}$ be a matrix representation of $\mathcal{D}$, and $\mathbf{B}$ the associated representation of $\mathcal{B}$. One then defines

$$\gamma_{\mathcal{B}}:=\frac{1}{\sqrt{\mathbf{1}_n^{\top}\mathbf{G}(\mathbf{B})^{-1}\mathbf{1}_n}}\quad\mbox{and}\quad\mathbf{u}_{\mathcal{B}}:=\gamma_{\mathcal{B}}\mathbf{B}^{-\top}\mathbf{1}_n. \qquad (4)$$

The vector $\mathbf{u}_{\mathcal{B}}$ is the only unit vector such that $\mathbf{u}_{\mathcal{B}}^{\top}\tfrac{\mathbf{d}}{\|\mathbf{d}\|}=\gamma_{\mathcal{B}}$ for all $\mathbf{d}\in\mathcal{B}$ [6, Lemmas 12 and 13]. Computing the quantities (4) for all linear bases contained in $\mathcal{D}$ then gives both the cosine measure and the cosine vector set of $\mathcal{D}$ [6, Theorem 19]. Indeed, we have

$$\operatorname{cm}(\mathcal{D})=\min_{\substack{\mathcal{B}\,\subset\,\mathcal{D}\\ \mathcal{B}\ \text{basis of}\ \mathbb{R}^n}}\max_{\mathbf{d}\in\mathcal{D}}\mathbf{u}_{\mathcal{B}}^{\top}\frac{\mathbf{d}}{\|\mathbf{d}\|}, \qquad (5)$$

while the cosine vector set is given by

$$c\mathcal{V}(\mathcal{D})=\left\{\mathbf{u}_{\mathcal{B}}:\max_{\mathbf{d}\in\mathcal{D}}\mathbf{u}_{\mathcal{B}}^{\top}\tfrac{\mathbf{d}}{\|\mathbf{d}\|}=\operatorname{cm}(\mathcal{D})\right\}.$$
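The following sketch illustrates the enumeration (4)-(5) in the simplest possible way; it is a brute-force rendition in the spirit of [6], not their exact algorithm, and it is exponential in $n$.

```python
# Sketch: cosine measure of a PSS D (n x m matrix) by enumerating all
# linear bases B contained in D and applying (4)-(5).
import numpy as np
from itertools import combinations

def cosine_measure(D):
    n, m = D.shape
    Dn = D / np.linalg.norm(D, axis=0)         # normalize the columns
    best = np.inf
    for idx in combinations(range(m), n):
        B = Dn[:, list(idx)]
        if abs(np.linalg.det(B)) < 1e-12:      # skip: not a linear basis
            continue
        Ginv_ones = np.linalg.solve(B.T @ B, np.ones(n))
        gamma = 1.0 / np.sqrt(np.ones(n) @ Ginv_ones)  # gamma_B in (4)
        u = gamma * np.linalg.solve(B.T, np.ones(n))   # u_B = gamma B^{-T} 1_n
        best = min(best, np.max(u @ Dn))       # max_d u_B^T d / ||d|| in (5)
    return best

D = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0]])               # I_2 u {-(1,1)}
print(cosine_measure(D))                       # ~0.3827, cf. Lemma 3.1 below
```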

The next proposition shows how, when dealing with OSPBs, formula (5) can be simplified to give a more direct link between the quantities $\gamma_{\mathcal{B}}$ and the cosine measure.

Proposition 3.1

Let $\mathcal{D}_{n,s}$ be an OSPB of $\mathbb{R}^n$ and $\mathcal{D}_{\mathbb{L}_1},\dots,\mathcal{D}_{\mathbb{L}_s}$ be a decomposition of $\mathcal{D}_{n,s}$ into minimal positive bases. Then,

$$\operatorname{cm}(\mathcal{D}_{n,s})=\min_{\substack{\mathcal{B}_n\,\subset\,\mathcal{D}_{n,s}\\ \mathcal{B}_n\ \text{basis of}\ \mathbb{R}^n}}\gamma_{\mathcal{B}_n}\quad\mbox{and}\quad c\mathcal{V}(\mathcal{D}_{n,s})=\{\mathbf{u}_{\mathcal{B}_n}:\gamma_{\mathcal{B}_n}=\operatorname{cm}(\mathcal{D}_{n,s})\},$$

where $\gamma_{\mathcal{B}_n}$ and $\mathbf{u}_{\mathcal{B}_n}$ are computed according to (4).

Proof. Using the decomposition of $\mathcal{D}_{n,s}$, any linear basis $\mathcal{B}_n$ of $\mathbb{R}^n$ contained in $\mathcal{D}_{n,s}$ can be decomposed as $\mathcal{B}_n=\mathcal{B}_{\mathbb{L}_1}\cup\cdots\cup\mathcal{B}_{\mathbb{L}_s}$, where $\mathcal{B}_{\mathbb{L}_i}\subset\mathcal{D}_{\mathbb{L}_i}$ is a linear basis of $\mathbb{L}_i$. By construction, the vector $\mathbf{u}_{\mathcal{B}_n}$ makes a dot product equal to $\gamma_{\mathcal{B}_n}>0$ with every normalized element of $\mathcal{B}_n$ (and thus with every normalized element of any $\mathcal{B}_{\mathbb{L}_i}$). Consequently, the vector $\mathbf{u}_{\mathcal{B}_n}$ lies in the positive span of $\mathcal{B}_n$, and its projection on any subspace $\mathbb{L}_i$ lies in the positive span of $\mathcal{B}_{\mathbb{L}_i}$.

Meanwhile, for any $i\in[\![1,s]\!]$, there exists $\mathbf{d}_i\in\mathcal{D}_{\mathbb{L}_i}$ such that $\mathcal{B}_{\mathbb{L}_i}=\mathcal{D}_{\mathbb{L}_i}\setminus\{\mathbf{d}_i\}$, and this vector satisfies $\mathbf{u}_{\mathcal{B}_n}^{\top}\mathbf{d}_i<0$ per Lemma 2.1(ii), leading to

$$\max_{\mathbf{d}\in\mathcal{D}_{n,s}}\frac{\mathbf{u}_{\mathcal{B}_n}^{\top}\mathbf{d}}{\|\mathbf{d}\|}=\max_{i\in[\![1,s]\!]}\left\{\max_{\mathbf{d}\in\mathcal{D}_{\mathbb{L}_i}}\frac{\mathbf{u}_{\mathcal{B}_n}^{\top}\mathbf{d}}{\|\mathbf{d}\|}\right\}=\max_{i\in[\![1,s]\!]}\left\{\max_{\mathbf{d}\in\mathcal{B}_{\mathbb{L}_i}}\frac{\mathbf{u}_{\mathcal{B}_n}^{\top}\mathbf{d}}{\|\mathbf{d}\|}\right\}=\max_{\mathbf{d}\in\mathcal{B}_n}\frac{\mathbf{u}_{\mathcal{B}_n}^{\top}\mathbf{d}}{\|\mathbf{d}\|}=\gamma_{\mathcal{B}_n},$$

where the second equality comes from $\mathbf{u}_{\mathcal{B}_n}^{\top}\mathbf{d}_i<0$ and $\max_{\mathbf{d}\in\mathcal{D}_{n,s}}\mathbf{u}_{\mathcal{B}_n}^{\top}\mathbf{d}>0$, and the last equality holds by definition of $\gamma_{\mathcal{B}_n}$. Recalling that $\operatorname{cm}(\mathcal{D}_{n,s})$ is given by (5) concludes the proof. $\Box$

One drawback of the approach described so far is that it requires computing the quantities (4) for all linear bases included in the positive spanning set of interest. In the case of OSPBs, we show below that the number of linear bases to be considered can be greatly reduced by leveraging their decomposition into minimal positive bases.

Theorem 3.2

Let $\mathcal{D}_{n,s}$ be an OSPB of $\mathbb{R}^n$ and $\mathcal{D}_{\mathbb{L}_1},\dots,\mathcal{D}_{\mathbb{L}_s}$ be a decomposition of $\mathcal{D}_{n,s}$ into minimal positive bases. Let $\mathcal{B}_n$ be a linear basis contained in $\mathcal{D}_{n,s}$ such that $\mathcal{B}_n=\cup_{i=1}^{s}\mathcal{B}_{\mathbb{L}_i}$, where every $\mathcal{B}_{\mathbb{L}_i}$ is a linear basis of the subspace corresponding to $\mathcal{D}_{\mathbb{L}_i}$. For each basis $\mathcal{B}_{\mathbb{L}_i}$, define

$$\gamma_{\mathcal{B}_{\mathbb{L}_i}}=\frac{1}{\sqrt{\mathbf{1}_{\mathbb{L}_i}^{\top}\mathbf{G}(\mathbf{B}_{\mathbb{L}_i})^{-1}\mathbf{1}_{\mathbb{L}_i}}},$$

where $\mathbf{B}_{\mathbb{L}_i}$ is a matrix representation of $\mathcal{B}_{\mathbb{L}_i}$ and $\mathbf{1}_{\mathbb{L}_i}$ denotes the vector of all ones in $\mathbb{R}^{\dim(\mathbb{L}_i)}$. Then,

$$\gamma_{\mathcal{B}_n}=\frac{1}{\sqrt{\sum_{i=1}^{s}\gamma_{\mathcal{B}_{\mathbb{L}_i}}^{-2}}}=\frac{1}{\sqrt{\sum_{i=1}^{s}\mathbf{1}_{\mathbb{L}_i}^{\top}\mathbf{G}(\mathbf{B}_{\mathbb{L}_i})^{-1}\mathbf{1}_{\mathbb{L}_i}}}, \qquad (6)$$

with $\gamma_{\mathcal{B}_n}$ being defined as in (4).

As a result, the cosine measure and cosine vector set of $\mathcal{D}_{n,s}$ are given by

$$\operatorname{cm}(\mathcal{D}_{n,s})=\frac{1}{\sqrt{\sum_{i=1}^{s}\max_{\mathbf{d}_i\in\mathcal{D}_{\mathbb{L}_i}}\gamma_{\mathcal{D}_{\mathbb{L}_i}\setminus\{\mathbf{d}_i\}}^{-2}}} \qquad (7)$$

and

$$c\mathcal{V}(\mathcal{D}_{n,s})=\left\{\mathbf{u}_{\mathcal{B}_n}:\mathcal{B}_n=\cup_{i=1}^{s}\mathcal{B}_{\mathbb{L}_i},\quad\mathbf{u}_{\mathcal{B}_n}=\frac{[\mathbf{B}_{\mathbb{L}_1}\ \cdots\ \mathbf{B}_{\mathbb{L}_s}]^{-\top}\mathbf{1}_n}{\sqrt{\sum_{i=1}^{s}\gamma_{\mathcal{B}_{\mathbb{L}_i}}^{-2}}}\right\} \qquad (8)$$

respectively.

Proof. Let $\mathbf{B}_n$ be a matrix representation of $\mathcal{B}_n$ such that $\mathbf{B}_n=\begin{bmatrix}\mathbf{B}_{\mathbb{L}_1}&\cdots&\mathbf{B}_{\mathbb{L}_s}\end{bmatrix}$. Then, the Gram matrix of $\mathbf{B}_n$ is $\mathbf{G}(\mathbf{B}_n)=\operatorname{diag}\left(\mathbf{G}(\mathbf{B}_{\mathbb{L}_1}),\dots,\mathbf{G}(\mathbf{B}_{\mathbb{L}_s})\right)$, implying that

$$\gamma_{\mathcal{B}_n}=\frac{1}{\sqrt{\mathbf{1}_n^{\top}\mathbf{G}(\mathbf{B}_n)^{-1}\mathbf{1}_n}}=\frac{1}{\sqrt{\mathbf{1}_n^{\top}\operatorname{diag}\left(\mathbf{G}(\mathbf{B}_{\mathbb{L}_1})^{-1},\dots,\mathbf{G}(\mathbf{B}_{\mathbb{L}_s})^{-1}\right)\mathbf{1}_n}}=\frac{1}{\sqrt{\sum_{i=1}^{s}\mathbf{1}_{\mathbb{L}_i}^{\top}\mathbf{G}(\mathbf{B}_{\mathbb{L}_i})^{-1}\mathbf{1}_{\mathbb{L}_i}}},$$

which proves (6).

Recall now that every linear basis $\mathcal{B}_{\mathbb{L}_i}$ can be written as $\mathcal{D}_{\mathbb{L}_i}\setminus\{\mathbf{d}_i\}$ for some $\mathbf{d}_i\in\mathcal{D}_{\mathbb{L}_i}$. Combining this property together with (6) and Proposition 3.1, we obtain

$$\operatorname{cm}(\mathcal{D}_{n,s})=\min_{\substack{\mathcal{B}_n\subset\mathcal{D}_{n,s}\\ \mathcal{B}_n\ \text{basis of}\ \mathbb{R}^n}}\gamma_{\mathcal{B}_n}=\min_{\substack{\mathcal{B}_{\mathbb{L}_1},\dots,\mathcal{B}_{\mathbb{L}_s}\\ \mathcal{B}_{\mathbb{L}_i}=\mathcal{D}_{\mathbb{L}_i}\setminus\{\mathbf{d}_i\}}}\frac{1}{\sqrt{\sum_{i=1}^{s}\gamma_{\mathcal{B}_{\mathbb{L}_i}}^{-2}}}=\min_{\substack{\mathbf{d}_1,\dots,\mathbf{d}_s\\ \mathbf{d}_i\in\mathcal{D}_{\mathbb{L}_i}}}\frac{1}{\sqrt{\sum_{i=1}^{s}\gamma_{\mathcal{D}_{\mathbb{L}_i}\setminus\{\mathbf{d}_i\}}^{-2}}}=\frac{1}{\sqrt{\sum_{i=1}^{s}\max_{\mathbf{d}_i\in\mathcal{D}_{\mathbb{L}_i}}\gamma_{\mathcal{D}_{\mathbb{L}_i}\setminus\{\mathbf{d}_i\}}^{-2}}},$$

proving (7).

Finally, combining (7) with the result of Proposition 3.1 on the cosine vector set gives

$$c\mathcal{V}(\mathcal{D}_{n,s})=\left\{\mathbf{u}_{\mathcal{B}_n}:\gamma_{\mathcal{B}_n}=\operatorname{cm}(\mathcal{D}_{n,s})\right\}=\left\{\mathbf{u}_{\mathcal{B}_n}:\mathcal{B}_n=\cup_{i=1}^{s}\mathcal{B}_{\mathbb{L}_i},\quad\frac{1}{\sqrt{\sum_{i=1}^{s}\gamma_{\mathcal{B}_{\mathbb{L}_i}}^{-2}}}=\operatorname{cm}(\mathcal{D}_{n,s})\right\}.$$

Using a matrix representation $\mathbf{B}_n=\begin{bmatrix}\mathbf{B}_{\mathbb{L}_1}&\cdots&\mathbf{B}_{\mathbb{L}_s}\end{bmatrix}$ along with the formula (4) for $\mathbf{u}_{\mathcal{B}_n}$ leads to

$$c\mathcal{V}(\mathcal{D}_{n,s})=\left\{\mathbf{u}_{\mathcal{B}_n}:\mathcal{B}_n=\cup_{i=1}^{s}\mathcal{B}_{\mathbb{L}_i},\quad\mathbf{u}_{\mathcal{B}_n}=\gamma_{\mathcal{B}_n}[\mathbf{B}_{\mathbb{L}_1}\ \cdots\ \mathbf{B}_{\mathbb{L}_s}]^{-\top}\mathbf{1}_n\right\}=\left\{\mathbf{u}_{\mathcal{B}_n}:\mathcal{B}_n=\cup_{i=1}^{s}\mathcal{B}_{\mathbb{L}_i},\quad\mathbf{u}_{\mathcal{B}_n}=\frac{[\mathbf{B}_{\mathbb{L}_1}\ \cdots\ \mathbf{B}_{\mathbb{L}_s}]^{-\top}\mathbf{1}_n}{\sqrt{\sum_{i=1}^{s}\gamma_{\mathcal{B}_{\mathbb{L}_i}}^{-2}}}\right\},$$

which is precisely (8). $\Box$

On a broader level, Theorem 3.2 shows that any calculation for a linear basis contained in an OSPB reduces to calculations on bases contained in each of the minimal positive bases in its decomposition. This observation leads to a principled way of computing the cosine measure from the OSPB decomposition, as described in Algorithm 1.

Algorithm 1: Cosine measure of an orthogonally structured positive basis

Input: An orthogonally structured positive basis $\mathcal{D}_{n,s}$ of $\mathbb{R}^n$ with normalized vectors.

Step 1. Compute a decomposition $\mathcal{D}_{n,s}=\mathcal{D}_{\mathbb{L}_1}\cup\cdots\cup\mathcal{D}_{\mathbb{L}_s}$ of $\mathcal{D}_{n,s}$ into $s$ minimal positive bases $\mathcal{D}_{\mathbb{L}_i}$ on subspaces of $\mathbb{R}^n$.

Step 2. For every $i\in[\![1,s]\!]$, compute

$$\beta_{\mathbb{L}_i}=\max_{\mathbf{d}\in\mathcal{D}_{\mathbb{L}_i}}\gamma_{\mathcal{D}_{\mathbb{L}_i}\setminus\{\mathbf{d}\}}^{-2}\quad\mbox{and}\quad\mathcal{A}_i=\underset{\mathbf{d}\in\mathcal{D}_{\mathbb{L}_i}}{\operatorname{arg\,max}}\ \gamma_{\mathcal{D}_{\mathbb{L}_i}\setminus\{\mathbf{d}\}}^{-2}.$$

For all $\mathbf{d}\in\mathcal{A}_i$, note $\mathcal{B}_{\mathbb{L}_i}^{(\mathbf{d})}=\mathcal{D}_{\mathbb{L}_i}\setminus\{\mathbf{d}\}$.

Step 3. Return the cosine measure

$$\operatorname{cm}(\mathcal{D}_{n,s})=\frac{1}{\sqrt{\sum_{i=1}^{s}\beta_{\mathbb{L}_i}}} \qquad (9)$$

and the cosine vector set

$$c\mathcal{V}(\mathcal{D}_{n,s})=\bigcup_{(\mathbf{d}_1,\dots,\mathbf{d}_s)\in\mathcal{A}_1\times\cdots\times\mathcal{A}_s}\left\{\mathbf{u}\in\mathbb{R}^n,\ \mathbf{u}=\frac{\begin{bmatrix}\mathbf{B}_{\mathbb{L}_1}^{(\mathbf{d}_1)}&\cdots&\mathbf{B}_{\mathbb{L}_s}^{(\mathbf{d}_s)}\end{bmatrix}^{-\top}\mathbf{1}_n}{\sqrt{\sum_{i=1}^{s}\beta_{\mathbb{L}_i}}}\right\},$$

where $\mathbf{B}_{\mathbb{L}_i}^{(\mathbf{d})}$ is a matrix representation of $\mathcal{B}_{\mathbb{L}_i}^{(\mathbf{d})}$.

In essence, Algorithm 1 is quite similar to the generic method proposed for positive spanning sets [6]. However, Algorithm 1 reduces the computation of the cosine measure to that over minimal positive bases, which leads to significantly cheaper calculations. Indeed, for any minimal positive basis $\mathcal{D}_{\mathbb{L}_i}$ involved in the decomposition, the algorithm considers $|\mathcal{D}_{\mathbb{L}_i}|$ linear bases, and computes the quantities of interest (4) for each of those (which amounts to inverting or solving a linear system involving the associated Gram matrix). Overall, Algorithm 1 thus checks $\sum_{i=1}^{s}|\mathcal{D}_{\mathbb{L}_i}|=|\mathcal{D}_{n,s}|=n+s$ bases to compute the cosine measure (9). This represents a significant improvement over the $\binom{n+s}{n}$ possible bases that are potentially required when the OSPB decomposition is not exploited, as in the algorithm for generic PSSs [6]. We also recall from Section 3.1 that Step 1 in Algorithm 1 can be performed in polynomial time.
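For concreteness, here is a sketch of Steps 2 and 3 of Algorithm 1 (our own implementation, assuming Step 1 has already produced the decomposition, e.g. via the ospb_blocks sketch of Section 3.1):

```python
# Sketch: cosine measure (9) of an OSPB given its decomposition.
# blocks[i] is an n x (n_i + 1) matrix whose normalized columns form the
# minimal positive basis D_{L_i}.
import numpy as np

def ospb_cosine_measure(blocks):
    betas = []
    for B in blocks:
        m = B.shape[1]
        inv_gammas_sq = []
        for j in range(m):                    # remove one element at a time
            Bj = np.delete(B, j, axis=1)      # linear basis D_{L_i} \ {d_j}
            ones = np.ones(m - 1)
            # gamma^{-2} = 1^T G(B_j)^{-1} 1, with G(B_j) the Gram matrix
            inv_gammas_sq.append(ones @ np.linalg.solve(Bj.T @ Bj, ones))
        betas.append(max(inv_gammas_sq))      # beta_{L_i} in Step 2
    return 1.0 / np.sqrt(sum(betas))          # formula (9) in Step 3

# Example: the maximal OSPB I_2 u -I_2, with blocks {e1,-e1} and {e2,-e2}
blocks = [np.array([[1.0, -1.0], [0.0, 0.0]]),
          np.array([[0.0, 0.0], [1.0, -1.0]])]
print(ospb_cosine_measure(blocks))            # 0.7071... = 1/sqrt(2)
```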

To end this section, we illustrate how Algorithm 1 results in a straightforward calculation in the case of some minimal positive bases, by computing explicitly the cosine measure of the minimal positive basis $\mathcal{I}_n\cup\{-\mathbf{1}_n\}$. In this case, the calculation becomes easy thanks to the orthogonality of the coordinate vectors within $\mathcal{I}_n$. Although the value (10) was recently stated [9, 20] and checked numerically using the method of Hare and Jarry-Bolduc [6], to the best of our knowledge the formal proof below is new.

Lemma 3.1

The cosine measure of the OSPB $\mathcal{I}_n\cup\{-\mathbf{1}_n\}$ is given by

$$\operatorname{cm}\left(\mathcal{I}_n\cup\{-\mathbf{1}_n\}\right)=\frac{1}{\sqrt{n^2+2(n-1)\sqrt{n}}}. \qquad (10)$$

Proof. For the sake of simplicity, we normalize the last vector in the set (which does not change the value of the cosine measure) and we let $\mathcal{D}_n=\mathcal{I}_n\cup\{-\tfrac{1}{\sqrt{n}}\mathbf{1}_n\}$ in the rest of the proof. Since $\mathcal{D}_n$ is a (minimal) positive basis, every linear basis contained in $\mathcal{D}_n$ is obtained by removing a single element, so that Proposition 3.1 yields

$$\operatorname{cm}(\mathcal{D}_n)=\min_{\mathbf{d}\in\mathcal{D}_n}\gamma_{\mathcal{D}_n\setminus\{\mathbf{d}\}}.$$

Two cases are to be considered. Suppose first that $\mathbf{d}=-\tfrac{1}{\sqrt{n}}\mathbf{1}_n$, so that $\mathcal{D}_n\setminus\{\mathbf{d}\}=\mathcal{I}_n$. Then, any matrix representation $\mathbf{B}$ of that basis is such that $\mathbf{G}(\mathbf{B})=\mathbf{I}_n$. Consequently,

$$\gamma_{\mathcal{D}_n\setminus\{\mathbf{d}\}}=\frac{1}{\sqrt{\mathbf{1}_n^{\top}\mathbf{G}(\mathbf{B})^{-1}\mathbf{1}_n}}=\frac{1}{\sqrt{\mathbf{1}_n^{\top}\mathbf{1}_n}}=\frac{1}{\sqrt{n}}.$$

Suppose now that $\mathbf{d}=\mathbf{e}_i$ for some $i\in[\![1,n]\!]$. In that case, considering the matrix representation $\mathbf{B}_i=\begin{bmatrix}\mathbf{e}_1&\cdots&\mathbf{e}_{i-1}&\mathbf{e}_{i+1}&\cdots&\mathbf{e}_n&-\tfrac{1}{\sqrt{n}}\mathbf{1}_n\end{bmatrix}$, we obtain that

$$\mathbf{G}(\mathbf{B}_i)=\begin{bmatrix}\mathbf{I}_{n-1}&-\tfrac{1}{\sqrt{n}}\mathbf{1}_{n-1}\\ -\tfrac{1}{\sqrt{n}}\mathbf{1}_{n-1}^{\top}&1\end{bmatrix}\quad\mbox{and}\quad\mathbf{G}(\mathbf{B}_i)^{-1}=\begin{bmatrix}\mathbf{I}_{n-1}+\mathbf{1}_{n-1}\mathbf{1}_{n-1}^{\top}&\sqrt{n}\mathbf{1}_{n-1}\\ \sqrt{n}\mathbf{1}_{n-1}^{\top}&n\end{bmatrix}.$$

Since these formulas do not depend on $i$, we obtain that for all $i\in[\![1,n]\!]$, we have

$$\gamma_{\mathcal{D}_n\setminus\{\mathbf{e}_i\}}=\frac{1}{\sqrt{\mathbf{1}_n^{\top}\mathbf{G}(\mathbf{B}_i)^{-1}\mathbf{1}_n}}=\frac{1}{\sqrt{\sum_{j=1}^{n}\sum_{\ell=1}^{n}[\mathbf{G}(\mathbf{B}_i)^{-1}]_{j\ell}}}.$$

Summing all coefficients of $\mathbf{G}(\mathbf{B}_i)^{-1}$ gives $(n-1)+(n-1)^2+2(n-1)\sqrt{n}+n=n^2+2(n-1)\sqrt{n}$, hence $\gamma_{\mathcal{D}_n\setminus\{\mathbf{e}_i\}}=\frac{1}{\sqrt{n^2+2(n-1)\sqrt{n}}}$. Since $n^2+2(n-1)\sqrt{n}\geq n$, comparing this value with $\tfrac{1}{\sqrt{n}}$ yields the desired conclusion. $\Box$
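As a quick sanity check of (10), take $n=2$: the lemma gives $\operatorname{cm}(\mathcal{I}_2\cup\{-\mathbf{1}_2\})=1/\sqrt{4+2\sqrt{2}}\approx 0.3827=\cos(67.5°)$. This matches the planar geometry: the three directions $\mathbf{e}_1$, $\mathbf{e}_2$ and $-\mathbf{1}_2/\sqrt{2}$ leave a largest angular gap of $135°$, and the worst-case unit vector bisects that gap.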

Algorithm 1 allows for efficient calculation of cosine measures for specific positive spanning sets that originate from OSPBs, as we will establish in Section 4 in the case of positive $k$-spanning sets.

4 Positive $k$-spanning sets and $k$-cosine measure

In this section, we are interested in positive spanning sets that retain their positively spanning ability when one or more of their elements are removed. Those sets were originally termed positive $k$-spanning sets [13], and we adopt the same terminology. Section 4.1 recalls the key definitions for positive $k$-spanning sets and positive $k$-bases, while Section 4.2 introduces the $k$-cosine measure, a generalization of the cosine measure from Section 2.3. Finally, Section 4.3 illustrates how to construct sets with guaranteed $k$-cosine measure based on OSPBs.

4.1 Positive $k$-spanning property

Our goal for this section is to provide a summary of results on positive $k$-spanning sets that mimic the standard ones for positive spanning sets. We start by defining the property of interest.

Definition 4.1 (Positive $k$-span and positive $k$-spanning set)

Let $\mathbb{L}$ be a linear subspace of $\mathbb{R}^n$ and $k\geq 1$. The positive $k$-span of a finite family of vectors $\mathcal{D}$ in $\mathbb{L}$, denoted $\operatorname{pspan}_k(\mathcal{D})$, is the set

$$\operatorname{pspan}_k(\mathcal{D}):=\bigcap_{\substack{\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|\geq|\mathcal{D}|-k+1}}\operatorname{pspan}(\mathcal{S}).$$

A positive $k$-spanning set (P$k$SS) of $\mathbb{L}$ is a family of vectors $\mathcal{D}$ such that $\operatorname{pspan}_k(\mathcal{D})=\mathbb{L}$.

As for Definition 2.1, when $\mathbb{L}$ is not specified, a P$k$SS should be understood as a P$k$SS of $\mathbb{R}^n$. By construction, any P$k$SS of $\mathbb{L}$ must contain a PSS of $\mathbb{L}$, and therefore is a PSS of $\mathbb{L}$ itself. Moreover, the notions of positive $k$-span and P$k$SS with $k=1$ coincide with those of positive span and PSS from Definition 2.1.

The notion of positive spanning set is inherently connected to that of spanning set, a standard concept in linear algebra. We provide below a counterpart notion associated with P$k$SSs.

Definition 4.2 ($k$-span and $k$-spanning set)

Let $\mathcal{D}$ be a finite family of vectors in $\mathbb{R}^n$. The $k$-span of $\mathcal{D}$, denoted $\operatorname{span}_k(\mathcal{D})$, is defined by

$$\operatorname{span}_k(\mathcal{D})\;=\;\bigcap_{\substack{\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|\geq|\mathcal{D}|-k+1}}\operatorname{span}(\mathcal{S}).$$

Given a subspace $\mathbb{L}$ of $\mathbb{R}^n$, a $k$-spanning set of $\mathbb{L}$ is a family of vectors $\mathcal{D}$ such that $\operatorname{span}_k(\mathcal{D})=\mathbb{L}$.

Using the two definitions above, we can generalize the results of Lemma 2.1 and Lemma 2.2 to positive $k$-spanning sets. The proof is omitted as it merely amounts to combining the definitions with these two lemmas.

Lemma 4.1

Let $\mathbb{L}$ be a subspace of $\mathbb{R}^n$ and $\mathcal{D}$ a finite family of vectors in $\mathbb{L}$. Let $k\in\mathbb{N}$ satisfy $1\leq k\leq|\mathcal{D}|$. The following statements are equivalent:

  1. (i)

    $\mathcal{D}$ is a P$k$SS of $\mathbb{L}$.

  2. (ii)

    For any nonzero vector $\mathbf{u}\in\mathbb{L}$, there exist $k$ elements $\mathbf{d}_1,\dots,\mathbf{d}_k$ of $\mathcal{D}$ such that $\mathbf{u}^{\top}\mathbf{d}_i>0$ for all $i\in[\![1,k]\!]$.

  3. (iii)

    $\operatorname{span}_k(\mathcal{D})=\mathbb{L}$ and for any $\mathcal{S}\subset\mathcal{D}$ of cardinality $|\mathcal{D}|-k+1$, the vector $\mathbf{0}_n$ can be written as a positive linear combination of the elements of $\mathcal{S}$.

Lemma 4.2

Let $\mathbb{L}$ be a subspace of $\mathbb{R}^n$ and let the finite set $\mathcal{D}$ be a P$k$SS for $\mathbb{L}$. Then, for any $\mathbf{d}\in\mathcal{D}$, the $k$-span of the family $\mathcal{D}\setminus\{\mathbf{d}\}$ is $\mathbb{L}$.

The equivalence between statements (i) and (ii) from Lemma 4.1 motivates the term "positive $k$-spanning sets". Indeed, given a P$k$SS and a vector in the subspace it positively $k$-spans, there exist $k$ elements of the P$k$SS that make an acute angle with that vector. Note however that the equivalence between statements (i) and (iii), as well as Definition 4.1, both provide an alternate characterization, namely that a P$k$SS is a PSS that retains this property when removing $k-1$ of its elements. This latter characterization implies that a P$k$SS of $\mathbb{R}^n$ must contain at least $n+k$ vectors, although this bound is only tight when $k=1$. Indeed, in [13, Corollary 5] it is shown that the minimal size of a positive $k$-spanning set in a subspace of dimension $\ell$ is $2k+\ell-1$.
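As an illustration (our own, in the spirit of the polytope-based constructions of [13]), the five unit vectors $\mathbf{d}_i=(\cos(2i\pi/5),\sin(2i\pi/5))$, $i\in[\![0,4]\!]$, form a P$2$SS of $\mathbb{R}^2$: for any nonzero $\mathbf{u}\in\mathbb{R}^2$, the directions making an acute angle with $\mathbf{u}$ cover an open arc of length $\pi$, which necessarily contains at least two of the five equally spaced directions. The size of this set, $5$, matches the minimal value $2k+\ell-1=2\cdot 2+2-1$ from [13, Corollary 5].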

Throughout Section 2, we highlighted positive bases as a special class of positive spanning sets. We now define the counterpart notion for positive $k$-spanning sets, termed positive $k$-bases.

Definition 4.3 (Positive $k$-basis)

Let $\mathbb{L}$ be a linear subspace of $\mathbb{R}^n$. A positive $k$-basis of $\mathbb{L}$ is a P$k$SS $\mathcal{D}$ of $\mathbb{L}$ satisfying

$$\forall\mathbf{d}\in\mathcal{D},\quad\operatorname{pspan}_k(\mathcal{D}\backslash\{\mathbf{d}\})\neq\mathbb{L}.$$

Positive $k$-bases can be thought of as inclusion-wise minimal positive $k$-spanning sets (similarly to the characterization of positive bases from Section 2.1). We showed earlier how the notion of positive independence can be used to give an alternative definition of positive bases. Let us generalize this idea and introduce the concept of positive $k$-independence, used to characterize positive $k$-bases.

Definition 4.4 (Positive $k$-independence)

Let $\mathbb{L}$ be a linear subspace of $\mathbb{R}^n$ of dimension $\ell\geq 1$ and let $k\geq 1$. A family of vectors $\mathcal{D}$ in $\mathbb{L}$ is positively $k$-independent if $|\mathcal{D}|\geq k$ and, for any $\mathbf{d}\in\mathcal{D}$, there exists a vector $\mathbf{u}\in\mathbb{L}$ having a positive dot product with exactly $k$ elements of $\mathcal{D}$, including $\mathbf{d}$.

Using this new concept, positive $k$-bases can alternatively be defined as positively $k$-independent P$k$SSs. We mention in passing that the upper bound on the size of a positive $k$-basis has yet to be determined, though it is known to exceed $2k\ell$ for an $\ell$-dimensional subspace [22].

4.2 The $k$-cosine measure

This section aims at generalizing the cosine measure from Section 2.3 to positive $k$-spanning sets so that it characterizes the positive $k$-spanning property. Our proposal, called the $k$-cosine measure, is described in Definition 4.5.

Definition 4.5 ($k$-cosine measure)

Let $\mathcal{D}$ be a finite family of nonzero vectors in $\mathbb{R}^n$ and let $k\in\mathbb{N}$ satisfy $1\leq k\leq|\mathcal{D}|$. The $k$-cosine measure of $\mathcal{D}$ is given by

$$\operatorname{cm}_k(\mathcal{D})=\min_{\substack{\|\mathbf{u}\|=1\\ \mathbf{u}\in\mathbb{R}^n}}\max_{\substack{\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=k}}\min_{\mathbf{d}\in\mathcal{S}}\frac{\mathbf{u}^{\top}\mathbf{d}}{\|\mathbf{d}\|}.$$

The $k$-cosine vector set associated with $\mathcal{D}$ is given by

$$c\mathcal{V}_k(\mathcal{D})=\underset{\substack{\|\mathbf{u}\|=1\\ \mathbf{u}\in\mathbb{R}^n}}{\operatorname{arg\,min}}\max_{\substack{\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=k}}\min_{\mathbf{d}\in\mathcal{S}}\frac{\mathbf{u}^{\top}\mathbf{d}}{\|\mathbf{d}\|}.$$

Note that Definition 2.7 is a special case of Definition 4.5 corresponding to $k=1$. In its general form, this definition expresses how well the vectors of the family of interest are spread in the space through subsets of $k$ vectors, which is related to property (ii) in Lemma 4.1. Our next result shows that this quantity characterizes P$k$SSs.
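As a worked example (our own), consider $\mathcal{D}=\mathcal{I}_2\cup-\mathcal{I}_2$ and $k=2$. For any unit vector $\mathbf{u}=(u_1,u_2)$, the two largest values of $\mathbf{u}^{\top}\mathbf{d}$ over $\mathbf{d}\in\mathcal{D}$ are $|u_1|$ and $|u_2|$, so the inner max-min in Definition 4.5 equals $\min(|u_1|,|u_2|)$, and minimizing over $\mathbf{u}$ yields $\operatorname{cm}_2(\mathcal{D})=0$ (attained at $\mathbf{u}=\mathbf{e}_1$). Thus $\mathcal{D}$ is a PSS with $\operatorname{cm}(\mathcal{D})=1/\sqrt{2}$, yet not a positive $2$-spanning set: removing $-\mathbf{e}_2$, say, leaves a family whose positive span is only a half-plane.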

Theorem 4.1

Let $\mathcal{D}$ be a finite family of vectors in $\mathbb{R}^n$ and let $k\in\mathbb{N}$ satisfy $1\leq k\leq|\mathcal{D}|$. Then $\mathcal{D}$ is a positive $k$-spanning set in $\mathbb{R}^n$ if and only if $\operatorname{cm}_k(\mathcal{D})>0$.

Proof. Without loss of generality we assume that all the elements of 𝒟\mathcal{D} are unit vectors.

Suppose first that 𝒟\mathcal{D} is a PPkSSSS. Then, by Lemma 4.1, for any unit vector 𝐮n\mathbf{u}\in\mathbb{R}^{n}, there exist kk vectors in 𝒟\mathcal{D} that make a positive dot product with 𝐮\mathbf{u}. Let 𝒮𝐮\mathcal{S}_{\mathbf{u}} be a subset of 𝒟\mathcal{D} consisting of these kk vectors. By construction, we have min𝐝𝒮𝐮𝐮𝐝>0\min\limits_{\mathbf{d}\in\mathcal{S}_{\mathbf{u}}}\mathbf{u}^{\top}\mathbf{d}>0 and thus

max𝒮𝒟|𝒮|=kmin𝐝𝒮𝐮𝐝min𝐝𝒮𝐮𝐮𝐝>0.\max_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=k\end{subarray}}\min_{\mathbf{d}\in\mathcal{S}}\mathbf{u}^{\top}\mathbf{d}\geq\min_{\mathbf{d}\in\mathcal{S}_{\mathbf{u}}}\mathbf{u}^{\top}\mathbf{d}>0.

Since the result holds for any unit vector 𝐮\mathbf{u}, we conclude that cmk(𝒟)>0\operatorname{cm}_{k}\left(\mathcal{D}\right)>0.

Conversely, suppose that cmk(𝒟)>0\operatorname{cm}_{k}\left(\mathcal{D}\right)>0. For any unit vector 𝐮\mathbf{u}, we have by assumption that

max𝒮𝒟|𝒮|=kmin𝐝𝒮𝐮𝐝>0.\max_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=k\end{subarray}}\min_{\mathbf{d}\in\mathcal{S}}\mathbf{u}^{\top}\mathbf{d}>0. (11)

In other words, at least kk elements of 𝒟\mathcal{D} have a positive dot product with 𝐮\mathbf{u}. This proves that 𝒟\mathcal{D} satisfies statement (ii) from Lemma 4.1 and thus 𝒟\mathcal{D} is a positive kk-spanning set.

\Box

As announced in Section 2.3, the proof of Theorem 4.1 covers that of Proposition 2.1 as the special case k=1k=1.

To end this section, we provide an alternative expression of the kk-cosine measure that connects this notion to the standard cosine measure.

Theorem 4.2

Let 𝒟\mathcal{D} be a finite family of nonzero vectors in n\mathbb{R}^{n} and let kk\in\mathbb{N} satisfy 1k|𝒟|1\leq k\leq|\mathcal{D}|. Then,

cmk(𝒟)=min𝒮𝒟|𝒮|=|𝒟|k+1cm(𝒮).\operatorname{cm}_{k}\left(\mathcal{D}\right)\;=\;\min_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=|\mathcal{D}|-k+1\end{subarray}}\operatorname{cm}\left(\mathcal{S}\right). (12)

Proof. Without loss of generality, we assume that 𝒟={𝐝1,,𝐝m}\mathcal{D}=\{\mathbf{d}_{1},\dots,\mathbf{d}_{m}\} where each 𝐝i\mathbf{d}_{i} is a unit vector. In order to show that

min𝐮=1max𝒮𝒟|𝒮|=kmin𝐝𝒮𝐮𝐝=min𝒮𝒟|𝒮|=mk+1cm(𝒮),\min\limits_{\|\mathbf{u}\|=1}\max_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=k\end{subarray}}\min\limits_{\mathbf{d}\in\mathcal{S}}\mathbf{u}^{\top}\mathbf{d}=\min_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=m-k+1\end{subarray}}\operatorname{cm}\left(\mathcal{S}\right),

we prove the equivalent statement

min𝐮=1max𝒮𝒟|𝒮|=kmin𝐝𝒮𝐮𝐝=min𝐮=1min𝒮𝒟|𝒮|=mk+1max𝐝𝒮𝐮𝐝.\min\limits_{\|\mathbf{u}\|=1}\max_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=k\end{subarray}}\min\limits_{\mathbf{d}\in\mathcal{S}}\mathbf{u}^{\top}\mathbf{d}=\min\limits_{\|\mathbf{u}\|=1}\min_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=m-k+1\end{subarray}}\max\limits_{\mathbf{d}\in\mathcal{S}}\mathbf{u}^{\top}\mathbf{d}. (13)

To this end, let 𝐮\mathbf{u} be a unit vector in n\mathbb{R}^{n}. We reorder the elements of 𝒟\mathcal{D} so that for all im1i\leq m-1, 𝐮𝐝i𝐮𝐝i+1\mathbf{u}^{\top}\mathbf{d}_{i}\geq\mathbf{u}^{\top}\mathbf{d}_{i+1} and we define 𝒮k={𝐝1,,𝐝k}\mathcal{S}_{k}^{-}=\{\mathbf{d}_{1},\dots,\mathbf{d}_{k}\} and 𝒮k+={𝐝k,,𝐝m}\mathcal{S}_{k}^{+}=\{\mathbf{d}_{k},\dots,\mathbf{d}_{m}\}. By construction, one has

𝐮𝐝k=min𝐝𝒮k𝐮𝐝=max𝐝𝒮k+𝐮𝐝.\mathbf{u}^{\top}\mathbf{d}_{k}=\min_{\mathbf{d}\in\mathcal{S}_{k}^{-}}\mathbf{u}^{\top}\mathbf{d}=\max_{\mathbf{d}\in\mathcal{S}_{k}^{+}}\mathbf{u}^{\top}\mathbf{d}. (14)

Moreover, the definitions of 𝒮k\mathcal{S}_{k}^{-} and 𝒮k+\mathcal{S}_{k}^{+} also imply that

𝒮kargmax𝒮𝒟|𝒮|=kmin𝐝𝒮𝐮𝐝and𝒮k+argmin𝒮𝒟|𝒮|=mk+1max𝐝𝒮𝐮𝐝.\mathcal{S}_{k}^{-}\in\underset{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=k\end{subarray}}{\operatorname{arg\,max}}\min_{\mathbf{d}\in\mathcal{S}}\mathbf{u}^{\top}\mathbf{d}\quad\mbox{and}\quad\mathcal{S}_{k}^{+}\in\underset{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=m-k+1\end{subarray}}{\operatorname{arg\,min}}\max_{\mathbf{d}\in\mathcal{S}}\mathbf{u}^{\top}\mathbf{d}. (15)

Combining (14) and (15) gives

max𝒮𝒟|𝒮|=kmin𝐝𝒮𝐮𝐝=min𝒮𝒟|𝒮|=mk+1max𝐝𝒮𝐮𝐝.\max_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=k\end{subarray}}\min_{\mathbf{d}\in\mathcal{S}}\mathbf{u}^{\top}\mathbf{d}\;=\;\min_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=m-k+1\end{subarray}}\max_{\mathbf{d}\in\mathcal{S}}\mathbf{u}^{\top}\mathbf{d}. (16)

Since (16) holds for any unit vector 𝐮n\mathbf{u}\in\mathbb{R}^{n}, it follows that

min𝐮=1max𝒮𝒟|𝒮|=kmin𝐝𝒮𝐮𝐝=min𝐮=1min𝒮𝒟|𝒮|=mk+1max𝐝𝒮𝐮𝐝,\min_{\|\mathbf{u}\|=1}\max_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=k\end{subarray}}\min_{\mathbf{d}\in\mathcal{S}}\mathbf{u}^{\top}\mathbf{d}\;=\;\min_{\|\mathbf{u}\|=1}\min_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}\\ |\mathcal{S}|=m-k+1\end{subarray}}\max_{\mathbf{d}\in\mathcal{S}}\mathbf{u}^{\top}\mathbf{d},

which is precisely (13) and, as a result, proves (12). \Box

The formula (12) is associated with another characterization of PkkSSs, namely that provided by Definition 4.1. In essence, the kk-cosine measure of a family 𝒟\mathcal{D} is the minimum of the cosine measures over all subsets of 𝒟\mathcal{D} with cardinality |𝒟|k+1|\mathcal{D}|-k+1. A useful corollary of that result is that for any k2k\geq 2, one has

cm(𝒟)=cm1(𝒟)cm2(𝒟)cmk(𝒟).\operatorname{cm}\left(\mathcal{D}\right)=\operatorname{cm}_{1}\left(\mathcal{D}\right)\geq\operatorname{cm}_{2}\left(\mathcal{D}\right)\geq\cdots\geq\operatorname{cm}_{k}\left(\mathcal{D}\right).
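Combined with a procedure for computing cosine measures, formula (12) yields a direct, if expensive, way to evaluate the k-cosine measure. The sketch below (Python with NumPy; the function names are ours) enumerates all subsets of cardinality |𝒟|−k+1 and takes the minimum of their cosine measures. For the cosine measure itself, it relies on the fact, underlying the deterministic algorithm of [6], that for a PSS the minimizing unit vector makes equal dot products with n linearly independent active directions; the resulting enumeration of bases is exponential in |𝒟|, in contrast with the polynomial-time Algorithm 1 available for OSPBs. Note that this use of (12) is legitimate when 𝒟 is a PkSS, since every subset appearing in (12) is then a PSS.

import numpy as np
from itertools import combinations

def cosine_measure(D):
    # Brute-force cosine measure of a PSS stored as columns of D:
    # for each basis B made of n columns, the equal-angle candidate u
    # solves B^T u = gamma * 1; the cosine measure is the smallest value
    # of max_d u^T d over these candidates (cf. [6]).
    n, m = D.shape
    U = D / np.linalg.norm(D, axis=0)
    best = np.inf
    for idx in combinations(range(m), n):
        B = U[:, list(idx)]
        if abs(np.linalg.det(B)) < 1e-12:
            continue  # not a linear basis
        u = np.linalg.solve(B.T, np.ones(n))  # equal-angle candidate
        u /= np.linalg.norm(u)
        best = min(best, np.max(u @ U))
    return best

def k_cosine_measure(D, k):
    # Formula (12): minimum of cm(S) over subsets of size |D| - k + 1.
    m = D.shape[1]
    return min(cosine_measure(D[:, list(idx)])
               for idx in combinations(range(m), m - k + 1))

# Example: the regular hexagon in R^2 has 2-cosine measure cos(pi/3) = 1/2.
hexagon = np.array([[np.cos(a), np.sin(a)] for a in np.arange(6) * np.pi / 3]).T
print(k_cosine_measure(hexagon, 2))  # approximately 0.5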

In the next section, we will show how PkkSSs built using OSPBs satisfy stronger properties associated with the kk-cosine measure.

4.3 Building positive kk-spanning sets using OSPBs

In this section, we describe how OSPBs can be used to generate positive kk-spanning sets or positive kk-bases with guarantees on their kk-cosine measure.

A first approach towards constructing positive kk-spanning sets consists in duplicating the same PSS kk times, so that every direction remains present even after removing k1k-1 elements (see the sketch below). However, such a strategy creates redundancy among the elements of the set. The purpose of this section is to introduce a more general approach based on rotation matrices that allows for all vectors to be distinct.
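Before turning to rotations, the duplication strategy can be sketched in a few lines (Python with NumPy; storing directions as columns is our convention): since every column of the original PSS appears k times, deleting any k−1 columns leaves at least one copy of each direction.

import numpy as np

D = np.hstack([np.eye(2), -np.ones((2, 1))])  # a minimal positive basis of R^2
k = 3
Dk = np.tile(D, k)  # k copies side by side: a positive k-spanning set of R^2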

Proposition 4.1

Let 𝒟n,s\mathcal{D}_{n,s} be an OSPB of n\mathbb{R}^{n} with s[[1,n]]s\in[\![1,n]\!]. Let 𝐑(1),,𝐑(k)\mathbf{R}^{(1)},\dots,\mathbf{R}^{(k)} be kk rotation matrices in n×n\mathbb{R}^{n\times n}, and define 𝒟n,s(j)\mathcal{D}_{n,s}^{(j)} as the family of vectors obtained by applying 𝐑(j)\mathbf{R}^{(j)} to each vector in 𝒟n,s\mathcal{D}_{n,s} for any j[[1,k]]j\in[\![1,k]\!]. Then, the set 𝒟n,s(1:k)=j=1k𝒟n,s(j)\mathcal{D}_{n,s}^{(1:k)}=\bigcup\limits_{j=1}^{k}\mathcal{D}_{n,s}^{(j)} is a positive kk-spanning set with

cmk(𝒟n,s(1:k))cm(𝒟n,s).\operatorname{cm}_{k}\left(\mathcal{D}_{n,s}^{(1:k)}\right)\geq\operatorname{cm}\left(\mathcal{D}_{n,s}\right). (17)

Proof. First, note that applying a rotation to every vector in a family does not change its cosine measure since rotations preserve angles, hence cm(𝒟n,s(j))=cm(𝒟n,s)\operatorname{cm}\left(\mathcal{D}_{n,s}^{(j)}\right)=\operatorname{cm}\left(\mathcal{D}_{n,s}\right) for every j[[1,k]]j\in[\![1,k]\!].

Consider a family 𝒮𝒟n,s(1:k)\mathcal{S}\subset\mathcal{D}_{n,s}^{(1:k)} with |𝒮|=|𝒟n,s(1:k)|k+1|\mathcal{S}|=\left|\mathcal{D}_{n,s}^{(1:k)}\right|-k+1. By the pigeonhole principle, this set must contain one of the kk positive bases obtained by rotation. Letting 𝒟n,s(j𝒮)\mathcal{D}_{n,s}^{(j_{\mathcal{S}})} denote that positive basis, we have cm(𝒮)cm(𝒟n,s(j𝒮))=cm(𝒟n,s)\operatorname{cm}\left(\mathcal{S}\right)\geq\operatorname{cm}\left(\mathcal{D}_{n,s}^{(j_{\mathcal{S}})}\right)=\operatorname{cm}\left(\mathcal{D}_{n,s}\right). Using Theorem 4.2, we obtain that

cmk(𝒟n,s(1:k))=min𝒮𝒟n,s(1:k)|𝒮|=|𝒟n,s(1:k)|k+1cm(𝒮)cm(𝒟n,s).\operatorname{cm}_{k}\left(\mathcal{D}_{n,s}^{(1:k)}\right)=\min_{\begin{subarray}{c}\mathcal{S}\subset\mathcal{D}_{n,s}^{(1:k)}\\ |\mathcal{S}|=\left|\mathcal{D}_{n,s}^{(1:k)}\right|-k+1\end{subarray}}\operatorname{cm}\left(\mathcal{S}\right)\geq\operatorname{cm}\left(\mathcal{D}_{n,s}\right).

In particular, this proves cmk(𝒟n,s(1:k))>0\operatorname{cm}_{k}\left(\mathcal{D}_{n,s}^{(1:k)}\right)>0, hence the fact that this set is a positive kk-spanning set follows from Theorem 4.1. \Box

Several remarks are in order regarding Proposition 4.1. First, note that the result extends to any positive spanning set, but we focus on OSPBs since in that case (17) gives a lower bound on the kk-cosine measure that can be computed in polynomial time. Moreover, when 𝐑(1)==𝐑(k)=𝐈n\mathbf{R}^{(1)}=\dots=\mathbf{R}^{(k)}=\mathbf{I}_{n}, i.e. when kk identical copies of 𝒟n,s\mathcal{D}_{n,s} are used, the resulting family is a positive kk-basis and relation (17) holds with equality. As a result, its kk-cosine measure can be computed in polynomial time using Algorithm 1. In general, however, the family generated through Proposition 4.1 is not necessarily a positive kk-basis. For instance, if n=2n=2, setting 𝒟2={2,𝟏}\mathcal{D}_{2}=\{\mathcal{I}_{2},-\mathbf{1}\}, k=2k=2, 𝐑(1)=𝐈2\mathbf{R}^{(1)}=\mathbf{I}_{2} and 𝐑(2)=[1/23/23/21/2]\mathbf{R}^{(2)}=\begin{bmatrix}1/2&-\sqrt{3}/2\\ \sqrt{3}/2&1/2\end{bmatrix} yields a family that is a positive 22-spanning set but not a positive 22-basis.
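The outcome also depends on the positive basis being rotated. If one instead takes for 𝒟2 the minimal positive basis of ℝ2 made of three unit vectors with pairwise angles 2π/3, together with the same rotations 𝐑(1) = 𝐈2 and 𝐑(2) (a rotation of angle π/3), the six resulting vectors point to the vertices of a regular hexagon. One can check that this family is a positive 2-basis: removing any single vector leaves a PSS, while removing any two adjacent vectors leaves a family contained in a closed half-plane, so no proper subset is a positive 2-spanning set. In this case cm(𝒟2) = cos(π/3) = 1/2, and the 2-cosine measure of the hexagon equals 1/2 as well, so that (17) holds with equality.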

We now present a strategy based on rotations tailored to OSPBs. Recall that OSPBs can be decomposed into minimal positive bases over orthogonal subspaces. Our proposal consists in applying separate rotations to each of those minimal positive bases. Our key result, Theorem 4.3, shows that one can design rotations of a minimal positive basis so as to obtain a positive kk-basis. The proof relies on two technical results, the first one being an intrinsic property of minimal positive bases.

Lemma 4.3

Let 𝒟𝕃={𝐝1,,𝐝+1}\mathcal{D}_{\mathbb{L}}=\{\mathbf{d}_{1},\dots,\mathbf{d}_{\ell+1}\} be a minimal positive basis of an \ell-dimensional subspace 𝕃\mathbb{L} in n\mathbb{R}^{n}. Then, there exist 𝐯1,,𝐯+1𝕃\mathbf{v}_{1},\dots,\mathbf{v}_{\ell+1}\in\mathbb{L} satisfying

i[[1,+1]],𝐝i𝐯i>0and1j<i+1,𝐝i𝐯j<0.\forall i\in[\![1,\ell+1]\!],\quad\mathbf{d}_{i}^{\top}\mathbf{v}_{i}>0\quad\text{and}\quad\forall 1\leq j<i\leq\ell+1,\quad\mathbf{d}_{i}^{\top}\mathbf{v}_{j}<0. (18)

Proof. By Lemma 2.1(ii), the zero vector 𝟎n\mathbf{0}_{n} can be written as a positive linear combination of all elements in 𝒟𝕃\mathcal{D}_{\mathbb{L}}. Without loss of generality, we rescale the elements of 𝒟𝕃\mathcal{D}_{\mathbb{L}} to ensure that i=1+1𝐝i=𝟎n\sum\limits_{i=1}^{\ell+1}\mathbf{d}_{i}=\mathbf{0}_{n}. Now, since 𝒟𝕃\mathcal{D}_{\mathbb{L}} is a minimal positive basis, the set 𝒟𝕃\{𝐝+1}\mathcal{D}_{\mathbb{L}}\backslash\{\mathbf{d}_{\ell+1}\} is a linear basis of 𝕃\mathbb{L}. We augment this set with nn-\ell vectors to form a basis n\mathcal{B}_{n} of n\mathbb{R}^{n}, and let 𝐁n\mathbf{B}_{n} be a matrix representation of that basis such that the first \ell columns correspond to 𝐝1,,𝐝\mathbf{d}_{1},\dots,\mathbf{d}_{\ell}. Then, we have 𝐝i=𝐁n𝐞i\mathbf{d}_{i}=\mathbf{B}_{n}\mathbf{e}_{i} for any i[[1,]]i\in[\![1,\ell]\!], while 𝐝+1=𝐁n(j=1𝐞j)\mathbf{d}_{\ell+1}=-\mathbf{B}_{n}\left(\sum_{j=1}^{\ell}\mathbf{e}_{j}\right), where we recall that 𝐞1,,𝐞\mathbf{e}_{1},\dots,\mathbf{e}_{\ell} are the first \ell vectors of the canonical basis.

We now define the vectors

{𝐯i=𝐁n(j=1𝐞j+(+1)𝐞i)i[[1,]]𝐯+1=𝐁nj=1𝐞j.\left\{\begin{array}[]{lll}\mathbf{v}_{i}&=&\mathbf{B}_{n}^{-\top}\left(-\sum_{j=1}^{\ell}\mathbf{e}_{j}+(\ell+1)\mathbf{e}_{i}\right)\quad i\in[\![1,\ell]\!]\\ \mathbf{v}_{\ell+1}&=&-\mathbf{B}_{n}^{-\top}\sum_{j=1}^{\ell}\mathbf{e}_{j}.\end{array}\right.

Using orthogonality of the coordinate vectors in n\mathbb{R}^{n}, we have that 𝐝i𝐯i=𝐞i2=>0\mathbf{d}_{i}^{\top}\mathbf{v}_{i}=\ell\|\mathbf{e}_{i}\|^{2}=\ell>0 for every i[[1,]]i\in[\![1,\ell]\!] and 𝐯+1𝐝+1=j=1𝐞j2>0\mathbf{v}_{\ell+1}^{\top}\mathbf{d}_{\ell+1}=\left\|\sum_{j=1}^{\ell}\mathbf{e}_{j}\right\|^{2}>0. As a result, the first part of (18) holds.

In addition, for any 1j<i+11\leq j<i\leq\ell+1, we obtain that 𝐝i𝐯j=𝐞i2=1<0\mathbf{d}_{i}^{\top}\mathbf{v}_{j}=-\|\mathbf{e}_{i}\|^{2}=-1<0 if i<+1i<\ell+1 and

𝐝+1𝐯j=i=1𝐞i2(+1)𝐞j2=i=1𝐞i2(+1)𝐞j2=(+1)=1<0,\mathbf{d}_{\ell+1}^{\top}\mathbf{v}_{j}=\left\|\sum_{i=1}^{\ell}\mathbf{e}_{i}\right\|^{2}-(\ell+1)\|\mathbf{e}_{j}\|^{2}=\sum_{i=1}^{\ell}\|\mathbf{e}_{i}\|^{2}-(\ell+1)\|\mathbf{e}_{j}\|^{2}=\ell-(\ell+1)=-1<0,

thus the second part of (18) also holds. \Box

Note that Lemma 4.3 is not equivalent to Definition 2.3 as the inequalities in (18) are strict.
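The construction used in this proof is easy to check numerically. The sketch below (Python with NumPy; variable names are ours) instantiates it for the minimal positive basis {e1, e2, e3, −𝟏} of ℝ3, whose elements already sum to zero so that no rescaling is needed, and verifies the sign pattern; in this instance all off-diagonal products 𝐝iᵀ𝐯j turn out to be negative, not only those with j < i as required by (18).

import numpy as np

l = 3
# minimal positive basis of R^3, already scaled so the vectors sum to zero
D = np.hstack([np.eye(l), -np.ones((l, 1))])      # columns d_1, ..., d_{l+1}
B = np.eye(l)                  # basis B_n formed by d_1, ..., d_l (here n = l)
ones = np.ones(l)
V = np.zeros((l, l + 1))
for i in range(l):
    # v_i = B^{-T} (-sum_j e_j + (l + 1) e_i)
    V[:, i] = np.linalg.solve(B.T, -ones + (l + 1) * np.eye(l)[:, i])
V[:, l] = np.linalg.solve(B.T, -ones)             # v_{l+1} = -B^{-T} sum_j e_j
G = D.T @ V                                       # G[i, j] = d_{i+1}^T v_{j+1}
assert np.all(np.diag(G) > 0)                     # first part of (18)
assert np.all(G[~np.eye(l + 1, dtype=bool)] < 0)  # off-diagonal products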

Our second technical result uses the result of Lemma 4.3 to define a quantity characteristic of the angles between vectors in a minimal positive basis.

Lemma 4.4

Suppose that n2n\geq\ell\geq 2 and let 𝒟𝕃={𝐝1,,𝐝+1}\mathcal{D}_{\mathbb{L}}=\{\mathbf{d}_{1},\dots,\mathbf{d}_{\ell+1}\} be a minimal positive basis of an \ell-dimensional subspace 𝕃\mathbb{L} in n\mathbb{R}^{n}. Let 𝐯1,,𝐯+1\mathbf{v}_{1},\dots,\mathbf{v}_{\ell+1} be vectors in 𝕃\mathbb{L} that satisfy (18). For any pair (i,i)[[1,+1]]2(i,i^{\prime})\in[\![1,\ell+1]\!]^{2}, let 𝐝i,i\mathbf{d}_{i,i^{\prime}} be the orthogonal projection of 𝐝i\mathbf{d}_{i} on 𝕃{𝐯i}\mathbb{L}\cap\{\mathbf{v}_{i^{\prime}}\}^{\perp}. Then, the quantity ρ𝒟𝕃:=maxi,i+1𝐝i𝐝i,i𝐝i𝐝i,i\rho_{\mathcal{D}_{\mathbb{L}}}:=\max\limits_{i,i^{\prime}\leq\ell+1}\frac{\mathbf{d}_{i}^{\top}\mathbf{d}_{i,i^{\prime}}}{\|\mathbf{d}_{i}\|\|\mathbf{d}_{i,i^{\prime}}\|} lies in the interval [0,1)[0,1).

Proof. Since 𝐝i,i\mathbf{d}_{i,i^{\prime}} is a projection of 𝐝i\mathbf{d}_{i} for any pair (i,i)(i,i^{\prime}), we have that 𝐝i𝐝i,i0\mathbf{d}_{i}^{\top}\mathbf{d}_{i,i^{\prime}}\geq 0, hence ρ𝒟𝕃0\rho_{\mathcal{D}_{\mathbb{L}}}\geq 0. Moreover, for any pair 1ii+11\leq i\leq i^{\prime}\leq\ell+1, we have 𝐝i𝐯i0\mathbf{d}_{i}^{\top}\mathbf{v}_{i^{\prime}}\neq 0 by (18), hence 𝐝i𝕃{𝐯i}\mathbf{d}_{i}\notin\mathbb{L}\cap\{\mathbf{v}_{i^{\prime}}\}^{\perp} and 𝐝i𝐝i,i𝐝i𝐝i,i<1\tfrac{\mathbf{d}_{i}^{\top}\mathbf{d}_{i,i^{\prime}}}{\|\mathbf{d}_{i}\|\|\mathbf{d}_{i,i^{\prime}}\|}<1. As a result, ρ𝒟𝕃<1\rho_{\mathcal{D}_{\mathbb{L}}}<1, completing the proof. \Box

The quantity defined in Lemma 4.4 is instrumental in defining suitable rotation matrices to produce a positive kk-basis. The construction is described in Theorem 4.3.

Theorem 4.3

Suppose that n2n\geq\ell\geq 2 and let 𝒟𝕃={𝐝1,,𝐝+1}\mathcal{D}_{\mathbb{L}}=\{\mathbf{d}_{1},\dots,\mathbf{d}_{\ell+1}\} be a minimal positive basis of an \ell-dimensional subspace 𝕃\mathbb{L} in n\mathbb{R}^{n}. Let ρ𝒟𝕃\rho_{\mathcal{D}_{\mathbb{L}}} be the quantity defined in Lemma 4.4 for some vectors 𝐯1,,𝐯+1\mathbf{v}_{1},\dots,\mathbf{v}_{\ell+1} satisfying (18). Finally, let 𝐑𝕃(1),,𝐑𝕃(k)\mathbf{R}^{(1)}_{\mathbb{L}},\dots,\mathbf{R}^{(k)}_{\mathbb{L}} be kk rotation matrices in n×n\mathbb{R}^{n\times n} such that

(i) 𝐑𝕃(j)𝐮=𝐮\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{u}=\mathbf{u} for any j[[1,k]]j\in[\![1,k]\!] and any vector 𝐮𝕃\mathbf{u}\in\mathbb{L}^{\perp};

(ii) for any pair (j,j)[[1,k]]2(j,j^{\prime})\in[\![1,k]\!]^{2} and any pair (i,i)[[1,+1]]2(i,i^{\prime})\in[\![1,\ell+1]\!]^{2},

[𝐑𝕃(j)𝐝i]𝐑𝕃(j)𝐝i𝐑𝕃(j)𝐝i𝐑𝕃(j)𝐝i=1j=jandi=i;\frac{\left[\mathbf{R}^{(j)}_{\mathbb{L}}\mathbf{d}_{i}\right]^{\top}\mathbf{R}^{(j^{\prime})}_{\mathbb{L}}\mathbf{d}_{i^{\prime}}}{\|\mathbf{R}^{(j)}_{\mathbb{L}}\mathbf{d}_{i}\|\|\mathbf{R}^{(j^{\prime})}_{\mathbb{L}}\mathbf{d}_{i^{\prime}}\|}=1\quad\Leftrightarrow\quad j=j^{\prime}\ \mbox{and}\ i=i^{\prime};

(iii) [𝐑𝕃(j)𝐝i]𝐝i𝐑𝕃(j)𝐝i𝐝i>ρ𝒟𝕃for any pair(i,j)[[1,+1]]×[[1,k]].\frac{\left[\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i}\right]^{\top}\mathbf{d}_{i}}{\|\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i}\|\|\mathbf{d}_{i}\|}>\rho_{\mathcal{D}_{\mathbb{L}}}\quad\text{for any pair}\quad(i,j)\in[\![1,\ell+1]\!]\times[\![1,k]\!].

Denote by 𝒟𝕃(j)\mathcal{D}_{\mathbb{L}}^{(j)} the family obtained by applying 𝐑𝕃(j)\mathbf{R}_{\mathbb{L}}^{(j)} to each vector in 𝒟𝕃\mathcal{D}_{\mathbb{L}}. Then, the family 𝒟𝕃(1:k)=j=1k𝒟𝕃(j)\mathcal{D}_{\mathbb{L}}^{(1:k)}=\bigcup\limits_{j=1}^{k}\mathcal{D}_{\mathbb{L}}^{(j)} is a positive kk-basis of 𝕃\mathbb{L} with no identical elements.

Proof. Owing to assumptions (i) and (ii), we know that 𝒟𝕃(1:k)\mathcal{D}_{\mathbb{L}}^{(1:k)} is a subset of 𝕃\mathbb{L} with pairwise distinct elements. Moreover, Proposition 4.1 guarantees that 𝒟𝕃(1:k)\mathcal{D}_{\mathbb{L}}^{(1:k)} is a PkkSS for 𝕃\mathbb{L}. Therefore, we only need to show the positive kk-independence of this set.

To this end, we consider an index i[[1,+1]]i\in[\![1,\ell+1]\!] and show that the vector 𝐯i\mathbf{v}_{i} makes a positive dot product with exactly kk elements of 𝒟𝕃(1:k)\mathcal{D}_{\mathbb{L}}^{(1:k)}, these elements being 𝐑𝕃(1)𝐝i,,𝐑𝕃(k)𝐝i\mathbf{R}_{\mathbb{L}}^{(1)}\mathbf{d}_{i},\dots,\mathbf{R}_{\mathbb{L}}^{(k)}\mathbf{d}_{i}. For any j[[1,k]]j\in[\![1,k]\!], we deduce from condition (iii) in the theorem’s statement that

[𝐑𝕃(j)𝐝i]𝐝i𝐑𝕃(j)𝐝i𝐝i>ρ𝒟𝕃𝐝i𝐝i,i𝐝i𝐝i,i,\frac{\left[\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i}\right]^{\top}\mathbf{d}_{i}}{\|\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i}\|\|\mathbf{d}_{i}\|}>\rho_{\mathcal{D}_{\mathbb{L}}}\geq\frac{\mathbf{d}_{i}^{\top}\mathbf{d}_{i,i}}{\|\mathbf{d}_{i}\|\|\mathbf{d}_{i,i}\|},

where 𝐝i,i\mathbf{d}_{i,i} is defined as in Lemma 4.4. Letting Θ(𝐮,𝐮)\Theta(\mathbf{u},\mathbf{u}^{\prime}) denote the angle (with values in [0,π][0,\pi]) between two elements 𝐮\mathbf{u} and 𝐮\mathbf{u}^{\prime} in n\mathbb{R}^{n}, the above property translates to Θ(𝐑𝕃(j)𝐝i,𝐝i)<Θ(𝐝i,𝐝i,i)\Theta(\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i},\mathbf{d}_{i})<\Theta(\mathbf{d}_{i},\mathbf{d}_{i,i}). Using now that Θ(𝐮,𝐮)Θ(𝐮,𝐮′′)+Θ(𝐮′′,𝐮)\Theta(\mathbf{u},\mathbf{u}^{\prime})\leq\Theta(\mathbf{u},\mathbf{u}^{\prime\prime})+\Theta(\mathbf{u}^{\prime\prime},\mathbf{u}^{\prime}) for any vector triplet (𝐮,𝐮,𝐮′′)(n)3(\mathbf{u},\mathbf{u}^{\prime},\mathbf{u}^{\prime\prime})\in(\mathbb{R}^{n})^{3}, we obtain

Θ(𝐑𝕃(j)𝐝i,𝐯i)Θ(𝐑𝕃(j)𝐝i,𝐝i)+Θ(𝐝i,𝐯i)<Θ(𝐝i,i,𝐝i)+Θ(𝐝i,𝐯i)=π2,\Theta(\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i},\mathbf{v}_{i})\leq\Theta(\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i},\mathbf{d}_{i})+\Theta(\mathbf{d}_{i},\mathbf{v}_{i})<\Theta(\mathbf{d}_{i,i},\mathbf{d}_{i})+\Theta(\mathbf{d}_{i},\mathbf{v}_{i})=\frac{\pi}{2},

where the last equality comes from the fact that 𝐝i,i\mathbf{d}_{i,i} and 𝐯i\mathbf{v}_{i} are orthogonal vectors. We have thus established that 𝐯i𝐑𝕃(j)𝐝i>0\mathbf{v}_{i}^{\top}\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i}>0 for all j[[1,k]]j\in[\![1,k]\!].

It now remains to show that 𝐯i𝐑𝕃(j)𝐝i0\mathbf{v}_{i}^{\top}\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i^{\prime}}\leq 0 for any iii^{\prime}\neq i. First, using the same argument as above yields Θ(𝐯i,𝐝i)=π2+Θ(𝐝i,i,𝐝i)\Theta(\mathbf{v}_{i},\mathbf{d}_{i^{\prime}})=\frac{\pi}{2}+\Theta(\mathbf{d}_{i^{\prime},i},\mathbf{d}_{i^{\prime}}) as well as Θ(𝐯i,𝐑𝕃(j)𝐝i)Θ(𝐯i,𝐝i)Θ(𝐝i,𝐑𝕃(j)𝐝i)\Theta(\mathbf{v}_{i},\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i^{\prime}})\geq\Theta(\mathbf{v}_{i},\mathbf{d}_{i^{\prime}})-\Theta(\mathbf{d}_{i^{\prime}},\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i^{\prime}}). Applying condition (iii), we find that Θ(𝐝i,𝐑𝕃(j)𝐝i)<Θ(𝐝i,𝐝i,i)\Theta(\mathbf{d}_{i^{\prime}},\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i^{\prime}})<\Theta(\mathbf{d}_{i^{\prime}},\mathbf{d}_{i^{\prime},i}), which finally leads to

Θ(𝐯i,𝐑𝕃(j)𝐝i)Θ(𝐯i,𝐝i)Θ(𝐝i,𝐑𝕃(j)𝐝i)>π2,\Theta(\mathbf{v}_{i},\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i^{\prime}})\geq\Theta(\mathbf{v}_{i},\mathbf{d}_{i^{\prime}})-\Theta(\mathbf{d}_{i^{\prime}},\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i^{\prime}})>\frac{\pi}{2},

hence 𝐯i𝐑𝕃(j)𝐝i0\mathbf{v}_{i}^{\top}\mathbf{R}_{\mathbb{L}}^{(j)}\mathbf{d}_{i^{\prime}}\leq 0. Overall, we have established that the vector 𝐯i\mathbf{v}_{i} makes a positive dot product with exactly kk vectors in 𝒟𝕃(1:k)\mathcal{D}_{\mathbb{L}}^{(1:k)}. Since any vector 𝐝𝒟𝕃(1:k)\mathbf{d}\in\mathcal{D}_{\mathbb{L}}^{(1:k)} is obtained by applying a rotation to some element 𝐝i𝒟𝕃\mathbf{d}_{i}\in\mathcal{D}_{\mathbb{L}}, we conclude that 𝒟𝕃(1:k)\mathcal{D}_{\mathbb{L}}^{(1:k)} is positively kk-independent, proving the desired result. \Box

Note that Theorem 4.3 only applies when 2\ell\geq 2 (thus requiring n2n\geq 2 as well). Indeed, when the subspace 𝕃\mathbb{L} has dimension 11, none of the three conditions (i), (ii) and (iii) can be fulfilled.

Remark 4.1

The first two conditions enforced on the rotation matrices in Theorem 4.3 are designed to produce distinct vectors in 𝕃\mathbb{L} without affecting the orthogonal complement of 𝕃\mathbb{L} in n\mathbb{R}^{n}. Indeed, condition (i) guarantees that the rotation leaves any vector orthogonal to 𝕃\mathbb{L} invariant. Condition (ii) enforces that all vectors produced by applying the rotations are distinct, since the angle between two distinct vectors has cosine less than 1. Finally, condition (iii) ensures that the vectors are positively kk-independent. These conditions can be ensured through careful control of the eigenvalues of those rotation matrices, according to the angles between the vectors in 𝒟𝕃\mathcal{D}_{\mathbb{L}}.
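To make these considerations concrete, the following sketch (Python with NumPy; the particular basis, the safety factor 0.9 and the tolerance are our choices) instantiates Theorem 4.3 in the plane, where 𝕃 = ℝ2 so that condition (i) is vacuous. It builds vectors 𝐯i as in the proof of Lemma 4.3, evaluates ρ numerically, skipping the degenerate pairs where the projection 𝐝i,i′ vanishes (which happens here when 𝐝i is parallel to 𝐯i), and then picks k planar rotations with distinct angles below arccos(ρ), so that conditions (ii) and (iii) hold.

import numpy as np

# equiangular minimal positive basis of R^2 (the vectors sum to zero)
ang = np.arange(3) * 2 * np.pi / 3
D = np.vstack([np.cos(ang), np.sin(ang)])        # unit columns d_1, d_2, d_3

# vectors v_i from the construction of Lemma 4.3, with B = [d_1 d_2]
B = D[:, :2]
ones = np.ones(2)
V = np.zeros((2, 3))
for i in range(2):
    V[:, i] = np.linalg.solve(B.T, -ones + 3 * np.eye(2)[:, i])
V[:, 2] = np.linalg.solve(B.T, -ones)

# rho from Lemma 4.4; {v}^perp is a line since L = R^2, and the d_i are
# unit vectors so no further normalization is needed
rho = 0.0
for i in range(3):
    for j in range(3):
        v = V[:, j] / np.linalg.norm(V[:, j])
        p = D[:, i] - (D[:, i] @ v) * v          # projection on {v_j}^perp
        if np.linalg.norm(p) > 1e-12:            # skip degenerate pairs
            rho = max(rho, D[:, i] @ p / np.linalg.norm(p))

def rot(t):
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

# k distinct angles below arccos(rho) (= pi/6 in this instance):
# conditions (ii) and (iii) of Theorem 4.3 are then satisfied
k = 2
thetas = np.linspace(0.0, 0.9 * np.arccos(rho), k)
Dk = np.hstack([rot(t) @ D for t in thetas])     # a positive 2-basis of R^2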

Theorem 4.3 provides a principled way of building positive kk-bases from minimal positive bases. This approach naturally extends to OSPBs through their decomposition into orthogonal minimal positive bases. Using the technique described in Theorem 4.3, one can define rotation matrices that act on a single subspace from the decomposition of the OSPB without affecting the remaining subspaces, thanks to orthogonality. The next corollary states the result.

Corollary 4.1

Let 𝒟n,s\mathcal{D}_{n,s} be an OSPB and let 𝒟𝕃1,,𝒟𝕃s\mathcal{D}_{\mathbb{L}_{1}},\dots,\mathcal{D}_{\mathbb{L}_{s}} be a decomposition (2) of 𝒟n,s\mathcal{D}_{n,s}. Assume that for every i[[1,s]]i\in[\![1,s]\!], dim(𝕃i)>1dim(\mathbb{L}_{i})>1. Let k1k\geq 1 and for every i[[1,s]]i\in[\![1,s]\!], let 𝐑𝕃i(1),,𝐑𝕃i(k)\mathbf{R}_{\mathbb{L}_{i}}^{(1)},\dots,\mathbf{R}_{\mathbb{L}_{i}}^{(k)} be kk rotation matrices satisfying the properties (i), (ii) and (iii) from Theorem 4.3 relative to 𝒟𝕃i\mathcal{D}_{\mathbb{L}_{i}}. Then, the set

𝒟n,s(1:k)=j=1k𝒟n,s(j),with𝒟n,s(j)=i=1s𝒟𝕃i(j)j[[1,k]],\mathcal{D}_{n,s}^{(1:k)}=\bigcup_{j=1}^{k}\mathcal{D}_{n,s}^{(j)},\qquad\mbox{with}\quad\mathcal{D}_{n,s}^{(j)}=\bigcup_{i=1}^{s}\mathcal{D}_{\mathbb{L}_{i}}^{(j)}\quad\forall j\in[\![1,k]\!],

is a positive kk-basis with cmk(𝒟n,s(1:k))cm(𝒟n,s)\operatorname{cm}_{k}\left(\mathcal{D}_{n,s}^{(1:k)}\right)\geq\operatorname{cm}\left(\mathcal{D}_{n,s}\right).

Corollary 4.1 thus provides a useful strategy to design positive kk-bases with guaranteed kk-cosine measure, since a lower bound on that quantity is given by cm(𝒟n,s)\operatorname{cm}\left(\mathcal{D}_{n,s}\right) and that measure can be efficiently computed through Algorithm 1. In particular, applying this strategy to a minimal positive basis yields a simple way of generating a positive kk-basis with k(n+1)k(n+1) vectors, as shown by Theorem 4.3 above.

Like Theorem 4.3, Corollary 4.1 does not apply to all OSPBs. In particular, the maximal OSPB {n,n}\{\mathcal{I}_{n},-\mathcal{I}_{n}\} decomposes into nn minimal positive bases over one-dimensional subspaces, which precludes the application of Theorem 4.3. Note, however, that Proposition 4.1 still provides a way to build PkkSSs with kk-cosine measure guarantees from such a maximal OSPB.
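As an illustration of Corollary 4.1, here is a final sketch (Python with NumPy, under the same conventions and assumptions as the previous ones; the angle π/8 is our choice, below the budget arccos(ρ) = π/6 computed above for the equiangular planar basis). It assembles an OSPB of ℝ4 from two equiangular minimal positive bases over orthogonal coordinate planes and rotates each block separately, the rotation of one plane leaving the other one invariant as required by condition (i).

import numpy as np

ang = np.arange(3) * 2 * np.pi / 3
T = np.vstack([np.cos(ang), np.sin(ang)])    # equiangular basis of a plane
Z = np.zeros((2, 3))
# OSPB of R^4: minimal positive bases of span{e1, e2} and span{e3, e4}
D = np.hstack([np.vstack([T, Z]), np.vstack([Z, T])])

def rot(t):
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

def rot_plane1(t):
    R = np.eye(4)
    R[:2, :2] = rot(t)
    return R   # fixes span{e3, e4} pointwise: condition (i) for L_1

def rot_plane2(t):
    R = np.eye(4)
    R[2:, 2:] = rot(t)
    return R   # fixes span{e1, e2} pointwise: condition (i) for L_2

theta = np.pi / 8   # distinct from 0 and below arccos(rho) = pi/6
D1 = np.hstack([rot_plane1(0.0) @ D[:, :3], rot_plane2(0.0) @ D[:, 3:]])
D2 = np.hstack([rot_plane1(theta) @ D[:, :3], rot_plane2(theta) @ D[:, 3:]])
Dk = np.hstack([D1, D2])   # positive 2-basis of R^4 with cm_2(Dk) >= cm(D)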

5 Conclusion

In this paper, we have studied two classes of positive spanning sets. On the one hand, we have investigated orthogonally structured positive bases, which possess a favorable structure in that they decompose into minimal positive bases over orthogonal subspaces. By exploiting this property, we described an algorithm that computes this structure as well as the value of the cosine measure in polynomial time, thereby improving on an existing procedure for generic positive spanning sets. On the other hand, we have conducted a detailed study of positive kk-spanning sets, a relatively understudied class of PSSs with resilient properties. We have provided several characterizations of these sets, including the generalization of the cosine measure through the kk-cosine measure. We have also leveraged OSPBs to build positive kk-spanning sets with guarantees on their kk-cosine measure based on rotations.

Our results open the way for several promising areas of research. We believe that the cosine measure calculation technique can be further improved for positive bases by leveraging their decomposition, although the presence of critical vectors poses a number of challenges when dealing with positive bases that are not orthogonally structured. Such results would also improve the practical calculation of kk-cosine measures. Finally, designing derivative-free optimization techniques based on positive kk-spanning sets with guarantees on their kk-cosine measure is a natural application of our study, and is the subject of ongoing work.

References

  • [1] C. Audet, A short proof on the cardinality of maximal positive bases, Optim. Letters, 5 (2011), pp. 191–194.
  • [2] C. Audet and W. Hare, Derivative-Free and Blackbox Optimization, Springer Series in Operations Research and Financial Engineering, Springer International Publishing, 2018.
  • [3] A. Conn, K. Scheinberg, and L. Vicente, Introduction to derivative-free optimization, vol. 8, SIAM, 2009.
  • [4] C. Davis, Theory of positive linear dependence, American Journal of Mathematics, 76 (1954), pp. 733–746.
  • [5] W. Hare and G. Jarry-Bolduc, Calculus identities for generalized simplex gradients: Rules and applications, Submitted to SIAM Journal on Optimization, (2018).
  • [6] W. Hare and G. Jarry-Bolduc, A deterministic algorithm to compute the cosine measure of a finite positive spanning set, Optim. Lett., 14 (2020), pp. 1305–1316.
  • [7] W. Hare, G. Jarry-Bolduc, and C. Planiden, Nicely structured positive bases with maximal cosine measure, Optim. Lett., (2023).
  • [8] J. Hopcroft and R. Tarjan, Algorithm 447: Efficient algorithms for graph manipulation, Commun. ACM, 16 (1973), pp. 372–378.
  • [9] G. Jarry-Bolduc, Structures in Derivative Free Optimization: Approximating Gradients/ Hessians, and Positive Bases, PhD thesis, University of British Columbia, 2023. https://circle.ubc.ca/.
  • [10] T. G. Kolda, R. M. Lewis, and V. Torczon, Optimization by direct search: New perspectives on some classical and modern methods, SIAM Rev., 45 (2003), pp. 385–482.
  • [11] R. M. Lewis and V. Torczon, Rank ordering and positive bases in pattern search algorithms., tech. rep., Institute for computer applications in science and engineering, Hampton VA, 1996.
  • [12] D. A. Marcus, Minimal positive 2-spanning sets of vectors, Proceedings of the AMS, 82 (1981), pp. 165–172.
  • [13] D. A. Marcus, Gale diagrams of convex polytopes and positive spanning sets of vectors, Discrete Applied Mathematics, 9 (1984), pp. 47–67.
  • [14] A. Marsden, Eigenvalues of the Laplacian and their relationship to the connectedness of a graph, University of Chicago, REU, (2013).
  • [15] R. McKinney, Positive bases for linear spaces, Transactions of the American Mathematical Society, 103 (1962), pp. 131–148.
  • [16] G. Naevdal, Positive bases with maximal cosine measure, Optimization Letters, 13 (2019), pp. 1381–1388.
  • [17] R. Regis, The calculus of simplex gradients, Optimization Letters, 9 (2015), pp. 845–865.
  • [18] R. Regis, On the properties of positive spanning sets and positive bases, Optimization and Engineering, 17 (2016), pp. 229–262.
  • [19] R. Regis, On the properties of the cosine measure and the uniform angle subspace, Comput. Optim. Appl., 78 (2021), pp. 915–952.
  • [20] L. Roberts and C. W. Royer, Direct search based on probabilistic descent in reduced spaces. arXiv:2204.01275v2, 2023.
  • [21] Z. Romanowicz, Geometric structure of positive bases in linear spaces, Applicationes Mathematicae, 19 (1987), pp. 557–567.
  • [22] R. Wotzlaw, Incidence graphs and unneighborly polytopes, PhD thesis, Technische Universität Berlin, 2009.