Using orthogonally structured positive bases for constructing positive k-spanning sets with cosine measure guarantees
Abstract
Positive spanning sets span a given vector space by nonnegative linear combinations of their elements. These sets have attracted significant attention in recent years, owing to their extensive use in derivative-free optimization. In this setting, the quality of a positive spanning set is assessed through its cosine measure, a geometric quantity that expresses how well such a set covers the space of interest. In this paper, we investigate the construction of positive k-spanning sets with geometrical guarantees. Our results build on recently identified positive spanning sets, called orthogonally structured positive bases. We first describe how to identify such sets and compute their cosine measures efficiently. We then focus our study on positive k-spanning sets, for which we provide a complete description, as well as a new notion of cosine measure that accounts for the resilient nature of such sets. By combining our results, we are able to use orthogonally structured positive bases to create positive k-spanning sets with guarantees on the value of their cosine measures.
Keywords
Positive spanning sets; Positive k-spanning sets; Positive bases; Positive k-bases; Cosine measure; k-cosine measure; Derivative-free optimization.
2020 Mathematics Subject Classification
15A03; 15A21; 15B30; 15B99; 90C56.
1 Introduction
Positive spanning sets, or PSSs, are sets of vectors that span a space of interest through nonnegative linear combinations. These sets were first investigated by Davis [4], with a particular emphasis on inclusion-wise minimal PSSs, also called positive bases [4, 15]. Such a restriction guarantees bounds on the minimal and maximal size of a positive basis [1, 4]. PSSs of this form are now well understood in the literature [18]. Any positive basis is amenable to a decomposition into minimal positive bases over subspaces [21], a result that recently led to the characterization of nicely structured positive bases [7], hereafter called orthogonally structured positive bases, or OSPBs.
Positive spanning sets and positive bases are now a standard tool to develop derivative-free optimization algorithms [5, 3]. The idea is to use directions from a PSS in place of derivative information. In that setting, it becomes critical to quantify how well directions from a PSS cover the space of interest, which is typically assessed through the cosine measure of this PSS [11, 10]. In general, no closed-form expression is available for this cosine measure, yet various computing techniques have been proposed [6, 19]. In particular, a deterministic algorithm to compute the cosine measure of any PSS was recently proposed by Hare and Jarry-Bolduc [6]. The cosine measure of certain positive bases is known, along with upper bounds for positive bases of minimal and maximal sizes [16, 18]. The intermediate size case is less understood, even though recent progress was made in the case of orthogonally structured positive bases [7].
Meanwhile, another category of positive spanning sets introduced in the 1970s has received little attention in the literature. Those sets, called positive k-spanning sets (PkSSs), can be viewed as resilient PSSs, since at least k of their elements must be removed in order to make them lose their positively spanning property [12]. Unlike standard PSSs, only partial results are known regarding inclusion-wise minimal PkSSs. These results depart from those for PSSs as they rely on polytope theory [13, 22]. Moreover, the construction of PkSSs based on PSSs has not been fully explored. To the best of our knowledge, the cosine measure of PkSSs has not been investigated, which partly prevents those sets from being used in derivative-free algorithms.
In this paper, we investigate positive k-spanning sets through the lens of orthogonally structured positive bases and cosine measures. To this end, we refine previous results on OSPBs so as to obtain an efficient cosine measure calculation technique. We then explain how the notion of cosine measure can be generalized to account for the positive k-spanning property, thereby introducing a quantity called the k-cosine measure. To the best of our knowledge, this definition is new in both the derivative-free optimization and the positive spanning set literature. By combining those elements, we are able to build positive k-spanning sets from OSPBs as well as to provide a bound on their k-cosine measures. Our results pave the way for using positive k-spanning sets in derivative-free algorithms.
The rest of this paper is organized as follows. Section 2 summarizes important properties of positive spanning sets, including their characterization through the cosine measure, with a particular focus on the subclass of orthogonally structured positive bases (OSPBs). Section 3 describes an efficient way to compute the cosine measure of an OSPB by leveraging its decomposition. Section 4 formalizes the main properties associated with positive k-spanning sets, introduces the k-cosine measure to help study these sets, and uses OSPBs to design PkSSs with guaranteed k-cosine measure. Section 5 summarizes our findings and provides several perspectives on our work.
Notations
Throughout this paper, we will work in the Euclidean space R^n with n ≥ 1, or a linear subspace thereof. More generally, blackboard bold letters will be used to designate infinite sets of vectors, such as linear (sub)spaces and half-spaces. The Minkowski sum of two spaces will be denoted by +, or by ⊕ if the two spaces are orthogonal to one another; the orthogonal complement of a given subspace will be denoted accordingly by the superscript ⊥. In order to allow for repetition among their elements, positive spanning sets will be seen as families of vectors. We will use calligraphic letters such as D to denote finite families of vectors. Given a family D, the linear span of this family (i.e., the set of linear combinations of its elements) will be denoted by span(D). Bold lowercase (resp. uppercase) letters will be used to designate vectors (resp. matrices). The notations 0 and 1 will respectively be used to designate the null vector and the all-ones vector, while e_1, ..., e_n will denote the coordinate basis vectors in R^n. Given a family D of m vectors in R^n, a matrix representation of D is a matrix with n rows and m columns whose columns correspond to the elements of D, in no particular order unless otherwise stated. We will use D \ {d} to denote a family obtained from D by removing exactly one instance of the vector d. Similarly, the notation D ∪ {d} will designate the family obtained from D by adding one copy of the vector d. Finally, for a positive integer m, the notation [m] will refer to the set {1, ..., m}.
2 Background on positive spanning sets
In this section, we introduce the main concepts and results on positive spanning sets that will be used throughout the paper. Section 2.1 defines positive spanning sets and positive bases. Section 2.2 introduces the concept of orthogonally structured positive bases, that were first studied under a different name [7]. Section 2.3 provides the definition of the cosine measure of a positive spanning set, along with its value for several commonly used positive bases.
2.1 Positive spanning sets and positive bases
In this section, we recall the definitions of positive spanning sets and positive bases, based on the seminal paper of Davis [4].
Definition 2.1 (Positive span and positive spanning set)
Let L be a linear subspace of R^n. The positive span of a family D of vectors in L, denoted pspan(D), is the set

pspan(D) = { sum over d in D of lambda_d d : lambda_d ≥ 0 for all d in D }.

A positive spanning set (PSS) of L is a family of vectors D such that pspan(D) = L.
When the subspace is not specified, a positive spanning set is understood as positively spanning R^n. Throughout the paper, we only consider positive spanning sets formed of nonzero vectors. Note, however, that we do not force all elements of a positive spanning set to be distinct. The next two lemmas are well known and describe basic properties of positive spanning sets.
Lemma 2.1
[18, Theorem 2.5] Let D be a finite family of vectors in a subspace L of R^n. The following statements are equivalent:
-
(i)
D is a PSS of L.
-
(ii)
For every nonzero vector u in L, there exists an element d in D such that u·d > 0.
-
(iii)
span(D) = L and 0 can be written as a positive linear combination of the elements of D.
Lemma 2.2
[4, Theorem 3.7] If D is a PSS of L, then for any d in D the set D \ {d} linearly spans L.
Note that Lemma 2.2 implies that a PSS of R^n must contain at least n+1 vectors. There is no upper bound on the number of elements that a PSS may contain, but such a restriction holds for a subclass of positive spanning sets called positive bases.
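As a quick sanity check, property (ii) of Lemma 2.1 can be probed numerically by sampling unit vectors. The sketch below is our own illustration (the helper name `is_pss_sampled` is hypothetical): sampling can refute the positive spanning property with high probability, but cannot formally certify it.

```python
import numpy as np

def is_pss_sampled(D, trials=20000, seed=0):
    """Probe Lemma 2.1(ii): every sampled nonzero u should make a
    positive dot product with some column of D.  A sampling check only:
    it can refute the PSS property, not certify it."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    for _ in range(trials):
        u = rng.normal(size=n)
        u /= np.linalg.norm(u)
        if not np.any(D.T @ u > 0):
            return False
    return True

# Minimal positive basis of R^2: coordinate vectors plus the negative of their sum.
D_min = np.array([[1.0, 0.0, -1.0],
                  [0.0, 1.0, -1.0]])
# Dropping its last element leaves a set that no longer positively spans R^2.
D_bad = D_min[:, :2]
print(is_pss_sampled(D_min))  # expected: True
print(is_pss_sampled(D_bad))  # expected: False (any u in the open negative quadrant refutes it)
```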
Definition 2.2 (Positive basis)
Let L be a linear subspace of R^n of dimension m. A positive basis of L of size s is a PSS of L with s elements, no proper subfamily of which is a PSS of L.
When L = R^n, such a set will simply be called a positive basis of size s.
In other words, positive bases of L are inclusion-wise minimal positive spanning sets of L [3]. Positive bases can also be defined thanks to the notion of positive independence.
Definition 2.3 (Positive independence)
Let L be a linear subspace of R^n. A family D of vectors in L is positively independent if, for any d in D, there exists a vector u such that u·d > 0 and u·d' ≤ 0 for any d' in D \ {d}.
A positive basis of L is thus a positively independent PSS of L. In the case L = R^n, we simply say that the family is a positive basis.
Definition 2.4
Let L be a linear subspace of R^n of dimension m. A positive basis of L is called minimal if its size equals m+1, and maximal if its size equals 2m. Otherwise, we say that the positive basis has intermediate size.
2.2 Orthogonally structured positive bases
A complete characterization of positive bases can be provided through a subspace decomposition [21]. Such a decomposition is based upon the concept of critical vectors defined below.
Definition 2.5 (Critical vectors)
Let L be a subspace of R^n and D be a positive basis of L. A vector c in L is called a critical vector of D if it cannot replace an element of D to form a positive spanning set, i.e., if (D \ {d}) ∪ {c} is not a PSS of L for every d in D.
Note that the zero vector is a critical vector for every positive basis of L, and that it is the only critical vector for a maximal positive basis [21]. Moreover, if D is a minimal positive basis, the set of its critical vectors (known as the complete critical set) admits an explicit description [21].
Using critical vectors, one may decompose any positive basis as a union of minimal positive bases. Theorem 2.1 gives the result for a positive basis in , by adapting the original decomposition result due to Romanowicz [21, Theorem 1] (see also [7, Lemma 25]).
Theorem 2.1 (Structure of a positive basis)
Suppose that n ≥ 2 and consider a positive basis D of R^n. Then, either D is minimal, or there exist subspaces L_1, ..., L_s of R^n together with associated minimal positive bases D_1, ..., D_s such that

(1) D = D_1 ∪ ... ∪ D_s,

where each D_i is a minimal positive basis built using a critical vector of the preceding bases.
The result of Theorem 2.1 is actually an equivalence, in that any set that admits the decomposition (1) is a non-minimal positive basis of R^n [21, Theorem 1]. The example below provides an illustration for both Definition 2.5 and Theorem 2.1. One can easily check that the stated vector is indeed critical, and that the proposed decomposition matches (1).
Example 2.1
Consider the following positive basis of :
The first four vectors of this set form a minimal positive basis for that admits as a critical vector. Therefore, a decomposition of the form (1) for is:
In general, however, the decomposition (1) is hard to compute, as it requires determining the critical vectors and the subspaces, which need not be orthogonal to one another. These considerations have led researchers to consider a subclass of positive bases with a nicer decomposition [7], leading to Definition 2.6 below.
Definition 2.6 (Orthogonally structured positive bases)
Suppose that n ≥ 2, and consider a positive basis D of R^n. If there exist pairwise orthogonal subspaces L_1, ..., L_s of R^n such that R^n = L_1 ⊕ ... ⊕ L_s, together with associated minimal positive bases D_1, ..., D_s of these subspaces, such that

(2) D = D_1 ∪ ... ∪ D_s,

then D is called an orthogonally structured positive basis, or OSPB.
By construction, note that any two elements of D drawn from distinct minimal positive bases of the decomposition (2) are orthogonal.
The class of orthogonally structured positive bases includes common positive bases, such as minimal ones and certain maximal positive bases such as those formed by the coordinate vectors and their negatives in . More OSPBs can be obtained via numerical procedures, even for intermediate sizes [7]. As will be shown in Section 3, it is also possible to compute their cosine measure in an efficient manner.
2.3 The cosine measure
To end this background section, we define the cosine measure, a metric associated with a given family of vectors.
Definition 2.7 (Cosine measure and cosine vector set)
Let D be a nonempty family of nonzero vectors in R^n. The cosine measure of D is given by

cm(D) = min over unit vectors u in R^n of max over d in D of (u·d)/||d||.

The cosine vector set associated with D is the set of unit vectors achieving the above minimum.
The cosine measure and cosine vector set provide insight into the geometry of the elements of D in the space. Such concepts are of particular interest in the case of positive spanning sets, as shown by the result of Proposition 2.1. Its proof can be found in key references in derivative-free optimization [2, 3], but note that our analysis in Section 4 provides a more general proof in the context of positive k-spanning sets.
Proposition 2.1
Let D be a nonempty, finite family of nonzero vectors in R^n. Then D is a positive spanning set of R^n if and only if cm(D) > 0.
Proposition 2.1 shows that the positive spanning property can be checked by computing the cosine measure. The value of the cosine measure further characterizes how well that set covers the space, which is relevant information in derivative-free algorithms [10]. Both observations motivate the search for efficient techniques to compute the cosine measure.
We end this section by providing several examples of cosine measures for orthogonally structured positive bases, which form the core interest of this paper. We focus on building those positive bases from the coordinate vectors. A classical choice for a maximal positive basis consists in selecting the coordinate vectors and their negatives. In that case, it is known [16] that the cosine measure equals 1/sqrt(n).
One can also consider a minimal positive basis formed by the coordinate vectors and the sum of their negatives; its cosine measure also admits a closed-form expression.
To the best of our knowledge, the latter formula has only been recently established in [9, 20], and no formal proof is available in the literature. In the next section, we will develop an efficient cosine measure calculation technique for orthogonally structured positive bases that will provide a proof of this result as a byproduct.
3 Computing OSPBs and their cosine measure
In this section, we describe how orthogonally structured positive bases can be identified using the decomposition introduced in the previous section. We also leverage this decomposition to design algorithms for detecting the OSPB structure and computing the cosine measure of a given OSPB.
3.1 Structure and detection of OSPBs
The favorable structure of an OSPB can be revealed through the properties of its matrix representations, and in particular the Gram matrices associated with an OSPB, in the sense of Definition 3.1 below.
Definition 3.1 (Gram matrix)
Let D be a finite family of vectors in R^n and V a matrix representation of this family. The Gram matrix of D associated with V is the matrix G(V) = V^T V.
Given a Gram matrix G(V) of D, any Gram matrix of D has the form G(VP) = P^T G(V) P, where P is a permutation matrix. We will show that when D is an OSPB, there exists a Gram matrix with a block-diagonal structure that reveals the decomposition of the basis, and thus its OSPB nature.
Theorem 3.1
Let n ≥ 2, and let D be a positive basis of R^n of size n + s. Then, D is orthogonally structured if and only if one of its Gram matrices can be written as a block diagonal matrix with s diagonal blocks.
Proof. Suppose first that D is an OSPB associated with the decomposition (2).
Let V be a matrix representation of D whose columns are ordered according to that decomposition, so that each minimal positive basis corresponds to a group of consecutive columns. By orthogonality of the subspaces, the Gram matrix V^T V is then block diagonal, with one diagonal block per minimal positive basis, and thus the desired result holds.
Conversely, suppose that there exists a matrix representation V of D such that V^T V is block diagonal. Then, by partitioning the columns of V with respect to those diagonal blocks, one can decompose D into subfamilies, one per block. By definition of the Gram matrix, this structure implies that columns from distinct blocks are orthogonal, thus we only need to show that each subfamily is a minimal positive basis for its linear span. To this end, we use the fact that D is positively spanning. By Lemma 2.1, there exists a vector of positive coefficients combining the elements of D into the zero vector. By decomposing this combination according to the blocks, we obtain that
(3) |
By orthogonality of the columns of the matrices, the property (3) is equivalent to
Using again Lemma 2.1(ii), this latter property is equivalent to each subfamily being a PSS for its linear span. Let m_i be the dimension of this span and s_i the number of elements in the i-th subfamily. Since the blocks are orthogonal and the columns of V span R^n, the m_i must sum to n. In addition, the s_i sum to the size of D, which is n plus the number of blocks. Since s_i ≥ m_i + 1 for all i, we conclude that s_i = m_i + 1 for all i, and thus every subfamily is a minimal positive basis.
Note that the characterization stated by Theorem 3.1 trivially holds for minimal positive bases. This theorem thus provides a characterization of the OSPB property through Gram matrices. One can then use this result to determine whether a positive basis has an orthogonal structure and, in that event, to highlight the associated decomposition.
Corollary 3.1
Let n ≥ 2, and let D be a positive basis of R^n with matrix representation V. If there exists a permutation matrix P such that the Gram matrix P^T V^T V P is block diagonal, then D is an OSPB, and its decomposition (2) is given by the groups of columns of VP corresponding to the diagonal blocks, in that order.
Corollary 3.1 provides a principled way of detecting the OSPB structure, by checking all possible permutations of the elements in the basis. Although this is not the focus of this work, we remark that the detection can be done in an efficient manner by considering a graph whose Laplacian matrix has nonzero entries in the exact same rows and columns as the Gram matrix. Indeed, the problem of finding a suitable permutation reduces to that of finding the connected components of this graph [14], and can be solved in polynomial time [8].
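Following the graph-based remark above, a minimal sketch of this detection step might look as follows. The function name `gram_blocks` and the tolerance are our own choices, and a plain breadth-first search over the support graph of the Gram matrix stands in for the Laplacian-based connected-component computation:

```python
import numpy as np

def gram_blocks(D, tol=1e-12):
    """Group the columns of D into connected components of the graph whose
    edges link columns with nonzero inner product (the support of the Gram
    matrix).  For an OSPB, each component gathers the columns of one
    minimal positive basis of the decomposition (2)."""
    G = D.T @ D
    m = G.shape[0]
    seen, blocks = set(), []
    for start in range(m):
        if start in seen:
            continue
        # Breadth-first search over the support graph of G.
        comp, stack = [], [start]
        seen.add(start)
        while stack:
            i = stack.pop()
            comp.append(i)
            for j in range(m):
                if j not in seen and abs(G[i, j]) > tol:
                    seen.add(j)
                    stack.append(j)
        blocks.append(sorted(comp))
    return blocks

# An OSPB of R^3: {e1, -e1} (minimal positive basis of a line) together with
# {e2, e3, -(e2+e3)/sqrt(2)} (minimal positive basis of the orthogonal plane).
D = np.array([[1.0, -1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0, -1/np.sqrt(2)],
              [0.0, 0.0, 0.0, 1.0, -1/np.sqrt(2)]])
print(gram_blocks(D))  # expected: [[0, 1], [2, 3, 4]]
```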
3.2 Cosine measure of an orthogonally structured positive basis
As seen in the previous section, orthogonally structured positive bases admit a decomposition into minimal positive bases that are orthogonal to one another. In this section, we investigate how this particular structure allows for efficient computation of the cosine measure.
Our approach builds on the algorithm introduced by Hare and Jarry-Bolduc [6], which computes the cosine measure of any positive spanning set in finite time (though the computation time may be exponential in the problem dimension). This procedure consists of a loop over all possible linear bases contained in the positive spanning set. More formally, given a positive spanning set D in R^n (assumed, without loss of generality, to consist of unit vectors) and a linear basis B contained in D, we let V(B) be a matrix representation of B. One then defines

(4) gamma_B = 1/||V(B)^{-T} 1||  and  u_B = V(B)^{-T} 1 / ||V(B)^{-T} 1||.

The vector u_B is the only unit vector such that u_B·d = gamma_B for all d in B [6, Lemmas 12 and 13]. Computing the quantities (4) for all linear bases contained in D then gives both the cosine measure and the cosine vector set of D [6, Theorem 19]. Indeed, we have

(5) cm(D) = min over linear bases B contained in D of max over d in D of u_B·d,

while the cosine vector set is given by the vectors u_B associated with bases B achieving the minimum in (5).
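To make the procedure concrete, here is a small-instance sketch of this enumeration, under the assumption that (4) takes the standard form gamma_B = 1/||B^{-T} 1|| with u_B the corresponding normalized vector (B being the matrix of the basis with unit columns). The function name is hypothetical, and the combinatorial loop is only practical for small families:

```python
import itertools
import numpy as np

def cosine_measure(D):
    """Cosine measure of a PSS by enumerating the linear bases it contains,
    in the spirit of the Hare--Jarry-Bolduc procedure: for each basis B,
    build the unit vector u_B making equal dot products with the basis
    elements, then take the worst max-cosine over the whole family."""
    Dn = D / np.linalg.norm(D, axis=0)          # work with unit columns
    n, m = Dn.shape
    best = np.inf
    for idx in itertools.combinations(range(m), n):
        B = Dn[:, idx]
        if abs(np.linalg.det(B)) < 1e-12:
            continue                             # not a linear basis
        w = np.linalg.solve(B.T, np.ones(n))     # w = B^{-T} 1
        u = w / np.linalg.norm(w)                # u_B, with u_B . b = gamma_B
        best = min(best, np.max(Dn.T @ u))
    return best

# Maximal positive basis [I -I] of R^3: cosine measure 1/sqrt(3).
I3 = np.eye(3)
D = np.hstack([I3, -I3])
print(np.isclose(cosine_measure(D), 1/np.sqrt(3)))  # expected: True
```

On the minimal positive basis of R^2 formed by e1, e2 and the normalized negative of their sum, the same sketch returns cos(67.5 degrees), matching the worst angular gap of that basis.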
The next proposition shows how, when dealing with OSPBs, formula (5) can be simplified to give a more direct link between the quantities in (4) and the cosine measure.
Proposition 3.1
Let D be an OSPB of R^n and D = D_1 ∪ ... ∪ D_s be a decomposition of D into minimal positive bases. Then,

cm(D) = min over linear bases B contained in D of gamma_B,

where gamma_B and u_B are computed according to (4).
Proof. Using the decomposition of , any linear basis of contained in can be decomposed as where is a linear basis of . By construction, the vector makes a positive dot product with every element of (and thus with every element of any ) equal to . Consequently, the vector lies in the positive span of , and its projection on any subspace lies in the positive span of .
Meanwhile, for any , there exists such that , and this vector satisfies per Lemma 2.1, leading to
where the second equality comes from the orthogonality of the subspaces in the decomposition, and the last equality holds by definition of gamma_B. Recalling that cm(D) is given by (5) concludes the proof.
One drawback of the approach described so far is that it requires computing the quantities (4) for all linear bases included in the positive spanning set of interest. In the case of OSPBs, we show below that the number of linear bases to be considered can be greatly reduced by leveraging their decomposition into minimal positive bases.
Theorem 3.2
Let D be an OSPB of R^n and D = D_1 ∪ ... ∪ D_s be a decomposition of D into minimal positive bases. Let B be a linear basis contained in D such that B = B_1 ∪ ... ∪ B_s, where every B_i is a linear basis of the subspace spanned by D_i. For each basis B_i, define

gamma_i = 1/sqrt( 1^T (V_i^T V_i)^{-1} 1 ),

where V_i is a matrix representation of B_i and 1 denotes the vector of all ones of appropriate dimension. Then,

(6) gamma_B = ( gamma_1^{-2} + ... + gamma_s^{-2} )^{-1/2},

with gamma_B being defined as in (4).
As a result, the cosine measure and cosine vector set of D are given by

(7) cm(D) = [ sum over i of ( min over linear bases contained in D_i of gamma_i )^{-2} ]^{-1/2}

and

(8) the set of vectors u_B associated with the bases attaining the minima in (7),

respectively.
Proof. Let V be a matrix representation of B whose columns are ordered according to the decomposition of B. Then, the Gram matrix of B is block diagonal with diagonal blocks V_i^T V_i, implying that 1^T (V^T V)^{-1} 1 = sum over i of 1^T (V_i^T V_i)^{-1} 1 = sum over i of gamma_i^{-2},
which proves (6).
Recall now that every linear basis can be written as for some . Combining this property together with (6) and Proposition 3.1, we obtain
proving (7).
Finally, combining (7) with the result of Proposition 3.1 on the cosine vector set gives
Using a matrix representation along with the formula (4) for leads to
which is precisely (8).
On a broader level, Theorem 3.2 shows that any calculation for a linear basis contained in an OSPB reduces to calculations on bases contained in each of the minimal positive bases in its decomposition. This observation leads to a principled way of computing the cosine measure from the OSPB decomposition, as described in Algorithm 1.
(9) |
In essence, Algorithm 1 is quite similar to the generic method proposed for positive spanning sets [6]. However, Algorithm 1 reduces the computation of the cosine measure to computations over minimal positive bases, which leads to significantly cheaper calculations. Indeed, for any minimal positive basis involved in the decomposition, the algorithm considers as many linear bases as that positive basis has elements, and computes the quantities of interest (4) for each of those (which amounts to inverting or solving a linear system involving the associated Gram matrix). Overall, Algorithm 1 thus checks only n + s bases to compute the cosine measure (9). This represents a significant improvement over the combinatorial number of bases potentially required when the OSPB decomposition is not exploited, as in the algorithm for generic PSSs [6]. We also recall from Section 3.1 that Step 1 in Algorithm 1 can be performed in polynomial time.
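The per-block computation behind Algorithm 1 can be sketched as follows, under two assumptions drawn from our reading of Theorem 3.2: each minimal positive basis contributes one linear basis per dropped element, and the per-block quantities combine through 1/gamma^2 = sum_i 1/gamma_i^2. All function names are ours, and the blocks may be given in their own subspace coordinates since only Gram matrices are used:

```python
import itertools
import numpy as np

def block_gamma_min(Di):
    """Smallest gamma over the linear bases contained in one minimal
    positive basis Di (each basis is obtained by dropping one element)."""
    Dn = Di / np.linalg.norm(Di, axis=0)
    d = np.linalg.matrix_rank(Dn)
    best = np.inf
    for idx in itertools.combinations(range(Dn.shape[1]), d):
        B = Dn[:, idx]
        G = B.T @ B
        if abs(np.linalg.det(G)) < 1e-12:
            continue
        gamma = 1/np.sqrt(np.ones(d) @ np.linalg.solve(G, np.ones(d)))
        best = min(best, gamma)
    return best

def ospb_cosine_measure(blocks):
    """Cosine measure of an OSPB from its decomposition into minimal
    positive bases over orthogonal subspaces, aggregating the per-block
    worst-case gammas through 1/gamma^2 = sum_i 1/gamma_i^2."""
    return 1/np.sqrt(sum(1/block_gamma_min(Di)**2 for Di in blocks))

# The maximal positive basis [I -I] of R^2 is an OSPB with blocks
# {e1, -e1} and {e2, -e2}; its cosine measure is 1/sqrt(2).
b1 = np.array([[1.0, -1.0]])     # minimal positive basis of a line
print(np.isclose(ospb_cosine_measure([b1, b1]), 1/np.sqrt(2)))  # expected: True
```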
To end this section, we illustrate how Algorithm 1 results in a straightforward calculation in the case of some minimal positive bases, by explicitly computing the cosine measure of the minimal positive basis formed by the coordinate vectors and the sum of their negatives. In this case, the calculation becomes easy thanks to the orthogonality of the vectors within the basis. Although the value (10) was recently stated [9, 20] and checked numerically using the method of Hare and Jarry-Bolduc [6], to the best of our knowledge the formal proof below is new.
Lemma 3.1
The cosine measure of the OSPB is given by
(10) |
Proof. For the sake of simplicity, we normalize the last vector in the set (which does not change the value of the cosine measure) and we let in the rest of the proof. Since is a positive basis, it follows that
Two cases are to be considered. Suppose first that , so that . Then, any matrix representation of that basis is such that . Consequently,
Suppose now that for some . In that case, considering the matrix representation , we obtain that
Since these formulas do not depend on , we obtain that for all , we have
Summing all coefficients of the inverse Gram matrix gives the quantity 1^T G^{-1} 1, hence the corresponding value of gamma.
Comparing this value with the one obtained in the first case yields the desired conclusion.
4 Positive k-spanning sets and k-cosine measure
In this section, we are interested in positive spanning sets that retain their positively spanning ability when one or more of their elements are removed. Those sets were originally termed positive k-spanning sets [13], and we adopt the same terminology. Section 4.1 recalls the key definitions for positive k-spanning sets and positive k-bases, while Section 4.2 introduces the k-cosine measure, a generalization of the cosine measure from Section 2.3. Finally, Section 4.3 illustrates how to construct sets with guaranteed k-cosine measure based on OSPBs.
4.1 The positive k-spanning property
Our goal for this section is to provide a summary of results on positive -spanning sets that mimic the standard ones for positive spanning sets. We start by defining the property of interest.
Definition 4.1 (Positive k-span and positive k-spanning set)
Let L be a linear subspace of R^n and k ≥ 1. The positive k-span of a finite family D of vectors in L, denoted pspan_k(D), is the set

pspan_k(D) = intersection of pspan(D \ S) over all subfamilies S of D with |S| = k−1.

A positive k-spanning set (PkSS) of L is a family of vectors D such that pspan_k(D) = L.
As for Definition 2.1, when the subspace is not specified, a PkSS should be understood as a PkSS of R^n. By construction, any PkSS of L must contain a PSS of L, and therefore is a PSS of L itself. Moreover, the notions of positive k-span and PkSS with k = 1 coincide with those of positive span and PSS from Definition 2.1. Similar to PSSs, we will omit the subspace of interest when it is R^n.
The notion of positive spanning set is inherently connected to that of spanning set, a standard concept in linear algebra. We provide below a counterpart notion associated with PkSSs.
Definition 4.2 (k-span and k-spanning set)
Let D be a finite family of vectors in R^n and k ≥ 1. The k-span of D, denoted span_k(D), is defined by

span_k(D) = intersection of span(D \ S) over all subfamilies S of D with |S| = k−1.

Given a subspace L of R^n, a k-spanning set of L is a family of vectors D such that span_k(D) = L.
Using the two definitions above, we can generalize the results of Lemma 2.1 and Lemma 2.2 to positive -spanning sets. The proof is omitted as it merely amounts to combining the definitions with these two lemmas.
Lemma 4.1
Let L be a subspace of R^n and D a finite family of vectors in L. Let k satisfy 1 ≤ k ≤ |D|. The following statements are equivalent:
-
(i)
D is a PkSS of L.
-
(ii)
For any nonzero vector u in L, there exist k elements d_1, ..., d_k of D such that u·d_i > 0 for all i in [k].
-
(iii)
D is a k-spanning set of L and, for any S ⊆ D of cardinality k−1, the vector 0 can be written as a positive linear combination of the elements of D \ S.
Lemma 4.2
Let L be a subspace of R^n and let the finite family D be a PkSS of L. Then, for any d in D, the k-span of the family D \ {d} is L.
The equivalence between statements (i) and (ii) from Lemma 4.1 motivates the term "positive k-spanning sets". Indeed, given a PkSS and a nonzero vector in the subspace it positively k-spans, there exist k elements of the PkSS that make an acute angle with that vector. Note however that the equivalence between statements (i) and (iii), as well as Definition 4.1, both provide an alternate characterization, namely that a PkSS is a PSS that retains the positive spanning property after removal of any k−1 of its elements. This latter characterization implies that a PkSS must contain at least n+k vectors, although this bound is not tight in general. Indeed, the minimal size of a positive k-spanning set in a subspace of a given dimension is known exactly [13, Corollary 5].
Throughout Section 2, we highlighted positive bases as a special class of positive spanning sets. We now define the counterpart notion for positive k-spanning sets, termed positive k-bases.
Definition 4.3 (Positive k-basis)
Let L be a linear subspace of R^n. A positive k-basis of L is a PkSS of L no proper subfamily of which is a PkSS of L.
Positive k-bases can be thought of as inclusion-wise minimal positive k-spanning sets (similarly to the characterization of positive bases from Section 2.1). We showed earlier how the notion of positive independence can be used to give an alternative definition of positive bases. Let us generalize this idea and introduce the concept of positive k-independence, used to characterize positive k-bases.
Definition 4.4 (Positive k-independence)
Let L be a linear subspace of R^n and let k ≥ 1. A family D of vectors in L is positively k-independent if, for any d in D, there exists a vector having a positive dot product with exactly k elements of D, including d.
Using this new concept, positive k-bases can alternatively be defined as positively k-independent PkSSs. We mention in passing that the upper bound on the size of a positive k-basis has yet to be determined, though it is known to exceed the corresponding bound for positive bases in a given dimension [22].
4.2 The k-cosine measure
This section aims at generalizing the cosine measure from Section 2.3 to positive k-spanning sets so that it characterizes the positive k-spanning property. Our proposal, called the k-cosine measure, is described in Definition 4.5.
Definition 4.5 (k-cosine measure)
Let D be a finite family of nonzero vectors in R^n and let k satisfy 1 ≤ k ≤ |D|. The k-cosine measure of D is given by

cm_k(D) = min over unit vectors u in R^n of max over S ⊆ D with |S| = k of min over d in S of (u·d)/||d||.

The k-cosine vector set associated with D is the set of unit vectors achieving the above minimum.
Note that Definition 2.7 is a special case of Definition 4.5 corresponding to k = 1. In its general form, this definition expresses how well the vectors of the family of interest are spread in the space through subsets of k vectors, which is related to property (ii) in Lemma 4.1. Our next result shows that this quantity characterizes PkSSs.
Theorem 4.1
Let D be a finite family of nonzero vectors in R^n and let k satisfy 1 ≤ k ≤ |D|. Then D is a positive k-spanning set of R^n if and only if cm_k(D) > 0.
Proof. Without loss of generality, we assume that all the elements of D are unit vectors.
Suppose first that D is a PkSS. Then, by Lemma 4.1, for any unit vector u, there exist k vectors in D that make a positive dot product with u. Let S be a subset of D consisting of k such vectors. By construction, the minimum of u·d over d in S is positive. Since this holds for any unit vector u, we conclude that cm_k(D) > 0.
Conversely, suppose that cm_k(D) > 0. For any unit vector u, we have by assumption that
(11) |
In other words, at least k elements of D have a positive dot product with u. This proves that D satisfies statement (ii) from Lemma 4.1, and thus is a positive k-spanning set.
As announced in Section 2.3, the proof of Theorem 4.1 covers that of Proposition 2.1 as the special case k = 1.
To end this section, we provide an alternate definition of the k-cosine measure that connects this notion to the cosine measure.
Theorem 4.2
Let D be a finite family of nonzero vectors in R^n and let k satisfy 1 ≤ k ≤ |D|. Then,

(12) cm_k(D) = min over subsets S ⊆ D with |S| = |D| − k + 1 of cm(S).
Proof. Without loss of generality, we assume that the elements of D are unit vectors. In order to show that
we prove the equivalent statement
(13) |
To this end, let be a unit vector in . We reorder the elements of so that for all , and we define and . By construction, one has
(14) |
Moreover, the definitions of and also imply that
(15) |
(16) |
Since (16) holds for any unit vector , it follows that
The formula (12) is associated with another characterization of PkSSs, namely that provided by Definition 4.1. In essence, the k-cosine measure of a family D is the minimum cosine measure among all subfamilies of D with cardinality |D| − k + 1. A useful corollary of that result is that for any k ≥ 2, one has cm_k(D) ≤ cm_{k−1}(D).
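The subset characterization (12) gives a direct, if expensive, way to evaluate the k-cosine measure on small instances. The sketch below combines it with a basis-enumeration cosine measure in the spirit of [6]; both function names are ours, and the example uses six evenly spaced unit vectors in R^2, which form a positive 2-spanning set:

```python
import itertools
import numpy as np

def cosine_measure(D):
    """Cosine measure via enumeration of the linear bases contained in D
    (small-instance sketch of the Hare--Jarry-Bolduc procedure)."""
    Dn = D / np.linalg.norm(D, axis=0)
    n, m = Dn.shape
    best = np.inf
    for idx in itertools.combinations(range(m), n):
        B = Dn[:, idx]
        if abs(np.linalg.det(B)) < 1e-12:
            continue
        w = np.linalg.solve(B.T, np.ones(n))
        best = min(best, np.max(Dn.T @ (w / np.linalg.norm(w))))
    return best

def k_cosine_measure(D, k):
    """k-cosine measure through the subset characterization (12): the
    minimum cosine measure over the subfamilies obtained by discarding
    k-1 elements of D."""
    m = D.shape[1]
    return min(cosine_measure(D[:, list(idx)])
               for idx in itertools.combinations(range(m), m - k + 1))

# Six unit vectors at angles 0, 60, ..., 300 degrees.
angles = np.radians(np.arange(0, 360, 60))
D = np.vstack([np.cos(angles), np.sin(angles)])
print(np.isclose(k_cosine_measure(D, 1), np.cos(np.radians(30))))  # expected: True
print(np.isclose(k_cosine_measure(D, 2), 0.5))                     # expected: True
```

Removing any single vector from the hexagon opens a 120-degree angular gap, which is why the 2-cosine measure drops from cos(30 degrees) to cos(60 degrees) = 1/2.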
In the next section, we will show how PkSSs built using OSPBs satisfy stronger properties associated with the k-cosine measure.
4.3 Building positive k-spanning sets using OSPBs
In this section, we describe how OSPBs can be used to generate positive k-spanning sets or positive k-bases with guarantees on their k-cosine measure.
A first approach towards constructing positive k-spanning sets consists in duplicating the same PSS k times, so that every direction remains present even after removing k−1 elements. However, such a strategy creates redundancy among the elements of the set. The purpose of this section is to introduce a more generic approach based on rotation matrices that allows for all vectors to be distinct.
Proposition 4.1
Let D be an OSPB of R^n. Let R_1, ..., R_k be rotation matrices in R^{n×n}, and define D_i as the family of vectors obtained by applying R_i to each vector in D, for any i in [k]. Then, the family D_1 ∪ ... ∪ D_k is a positive k-spanning set with

(17) cm_k(D_1 ∪ ... ∪ D_k) ≥ cm(D).
Proof. First, note that applying a rotation to every vector in a family does not change its cosine measure, since rotations preserve angles; hence cm(D_i) = cm(D) for every i.
Consider now a subfamily S of D_1 ∪ ... ∪ D_k obtained by removing at most k−1 elements. By the pigeonhole principle, this subfamily must contain one of the rotated positive bases in its entirety, say D_j, so that cm(S) ≥ cm(D_j) = cm(D). Using Theorem 4.2, we obtain that cm_k(D_1 ∪ ... ∪ D_k) ≥ cm(D).
In particular, this proves cm_k(D_1 ∪ ... ∪ D_k) > 0, hence the set being positively k-spanning follows from Theorem 4.1.
Several remarks are in order regarding Proposition 4.1. First, note that the result extends to any positive spanning set, but we focus on OSPBs because in that case (17) gives a lower bound on the k-cosine measure that can be computed in polynomial time. Moreover, when identical copies of are used, the resulting family is a positive k-basis and relation (17) holds with equality; as a result, its k-cosine measure can be computed in polynomial time using Algorithm 1. In general, however, the family generated through Proposition 4.1 is not necessarily a positive k-basis. For instance, if , setting , , and yields a family that is a positive k-spanning set but not a positive k-basis.
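A small sketch of the rotation-based construction may help fix ideas. The code below is ours and works in R^2, under the same convention as above (a positive k-spanning set survives any k-1 deletions): two rotated copies of a minimal positive basis form a positive 2-spanning family whose six vectors are pairwise distinct, unlike with duplication.

```python
import math
from itertools import combinations

def rotate(theta, v):
    """Apply the 2 x 2 rotation of angle theta to v = (x, y)."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = v
    return (c * x - s * y, s * x + c * y)

def is_pss_2d(vectors):
    """A finite set positively spans R^2 iff its largest angular gap
    between consecutive directions is strictly below pi."""
    angles = sorted(math.atan2(y, x) for x, y in vectors)
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(2 * math.pi - (angles[-1] - angles[0]))
    return max(gaps) < math.pi

def is_positive_k_spanning_2d(vectors, k):
    """Check that the family stays a PSS after deleting any k - 1 elements."""
    m = len(vectors)
    return all(is_pss_2d(sub) for sub in combinations(vectors, m - (k - 1)))

s = 1 / math.sqrt(2)
basis = [(1.0, 0.0), (0.0, 1.0), (-s, -s)]
# Two rotated copies (angles 0 and 0.3 radians) of the same positive basis:
family = [rotate(t, v) for t in (0.0, 0.3) for v in basis]

print(is_positive_k_spanning_2d(family, 2))  # True: survives any single deletion
print(len(set(family)) == len(family))       # True: all six vectors distinct
```

By the pigeonhole argument of the proof, deleting one vector always leaves at least one intact rotated copy of the basis.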
We now present a strategy based on rotations tailored to OSPBs. Recall that OSPBs can be decomposed into minimal positive bases over orthogonal subspaces. Our proposal consists in applying separate rotations to each of those minimal positive bases. Our key result, Theorem 4.3, shows that one can define a strategy for rotating a minimal positive basis so as to obtain a positive k-basis. The proof relies on two technical results, the first being an intrinsic property of minimal positive bases.
Lemma 4.3
Let be a minimal positive basis of an -dimensional subspace in . Then, there exists satisfying
(18)
Proof. By Lemma 2.1, the zero vector can be obtained by a positive linear combination of all elements in . Without loss of generality, we rescale the elements of to ensure that . Now, since is a minimal positive basis, the set is a linear basis of . Suppose that we augment this set with vectors to form a basis of , and let be a matrix representation of that basis such that the first columns correspond to . Then, we have for any , while , where we recall that are the first vectors of the canonical basis.
We now define the vectors
Using orthogonality of the coordinate vectors in , we have that for every and . As a result, the first part of (18) holds.
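The construction in this proof relies on a dual-basis mechanism: the rows of the inverse of a basis matrix pair to 1 with the matching basis vector and to 0 with the others. As a minimal, self-contained illustration of that mechanism in R^2 (the function name is ours, and this is only a sketch of the linear-algebra step, not the full proof):

```python
def dual_basis_2d(d1, d2):
    """Dual basis (u1, u2) of linearly independent d1, d2 in R^2: the
    rows of the inverse of the matrix whose columns are d1 and d2, so
    that u_i . d_j equals 1 when i == j and 0 otherwise."""
    (a, b), (c, d) = d1, d2
    det = a * d - b * c  # nonzero by linear independence
    u1 = (d / det, -c / det)
    u2 = (-b / det, a / det)
    return u1, u2

# Example: d1 = e1, d2 = e1 + e2.
u1, u2 = dual_basis_2d((1.0, 0.0), (1.0, 1.0))
dot = lambda u, v: u[0] * v[0] + u[1] * v[1]
print(dot(u1, (1.0, 0.0)), dot(u1, (1.0, 1.0)))  # 1.0 0.0
print(dot(u2, (1.0, 0.0)), dot(u2, (1.0, 1.0)))  # 0.0 1.0
```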
Our second technical result uses the result of Lemma 4.3 to define a quantity characteristic of the angles between vectors in a minimal positive basis.
Lemma 4.4
Suppose that and let be a minimal positive basis of a -dimensional subspace in . Let be vectors in that satisfy (18). For any pair , let be the orthogonal projection of on . Then, the quantity lies in the interval .
Proof. Since is a projection of for any pair , we have that , hence . Moreover, for any pair , we have by (18), hence and . As a result, , completing the proof.
The quantity defined in Lemma 4.4 is instrumental in defining suitable rotation matrices to produce a positive k-basis. The construction is described in Theorem 4.3.
Theorem 4.3
Suppose that and let be a minimal positive basis of an -dimensional subspace in . Let be the quantity defined in Lemma 4.4 for some vectors satisfying (18). Finally, let be rotation matrices in such that
- (i) for any and any vector ,
- (ii) for any pair and any pair ,
- (iii) .
Denote by the family obtained by applying to each vector in . Then, the family is a positive k-basis of with no identical elements.
Proof. Owing to assumptions (i) and (ii), we know that is a subset of with pairwise distinct elements. Moreover, Proposition 4.1 guarantees that is a PSS for . Therefore, we only need to show the positive k-independence of this set.
To this end, we consider an index and show that the vector makes a positive dot product with exactly elements of , these elements being . For any , we deduce from the conditions in the theorem’s statement that
where is defined as in Lemma 4.4. Letting denote the angle (with values in ) between two elements and in , the above property translates to . Using now that for any vector triplet , we obtain
where the last equality comes from the fact that and are orthogonal vectors. We have thus established that for all .
It now remains to show that for any . First, using the same argument as above yields as well as . Applying condition (iii), we find that , which finally leads to
hence . Overall, we have established that the vector makes a positive scalar product with exactly vectors in . Since any vector is obtained by applying a rotation to some element , we conclude that is positively k-independent, proving the desired result.
Note that Theorem 4.3 only applies when (thus requiring as well). Indeed, when the subspace has dimension , none of the three conditions (i), (ii) and (iii) can be fulfilled.
Remark 4.1
The first two conditions enforced on the rotation matrices in Theorem 4.3 are designed to produce distinct vectors in without affecting the orthogonal complement of in . Indeed, condition (i) guarantees that the rotation leaves any vector orthogonal to invariant. Condition (ii) enforces that all vectors produced by applying the rotations are distinct, since the angle between two distinct unit vectors has cosine less than 1. Finally, condition (iii) ensures that the vectors are positively k-independent. These conditions can be ensured through careful control of the eigenvalues of those rotation matrices, according to the angles between the vectors in .
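As a sketch of how such rotations can be realized (our construction, not taken from the paper), one can embed a 2 x 2 rotation block into an identity matrix: the resulting matrix rotates only within the chosen coordinate plane and fixes its orthogonal complement pointwise, which is exactly the invariance behavior discussed in the remark above.

```python
import math

def plane_rotation(n, i, j, theta):
    """n x n rotation matrix acting by angle theta in the (i, j)
    coordinate plane and fixing its orthogonal complement pointwise."""
    R = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    c, s = math.cos(theta), math.sin(theta)
    R[i][i], R[i][j], R[j][i], R[j][j] = c, -s, s, c
    return R

def matvec(R, v):
    """Product of a list-of-lists matrix with a vector."""
    return tuple(sum(R[r][c] * v[c] for c in range(len(v)))
                 for r in range(len(R)))

# A rotation of R^3 acting only inside span{e1, e2}:
R = plane_rotation(3, 0, 1, 0.5)
print(matvec(R, (0.0, 0.0, 1.0)))  # (0.0, 0.0, 1.0): e3 is left invariant
```

The rotation angle plays the role of the eigenvalue control mentioned above: choosing it according to the angles between the basis vectors is what the conditions of Theorem 4.3 formalize.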
Theorem 4.3 provides a principled way of building positive k-bases from minimal positive bases. This approach naturally extends to OSPBs through their decomposition into orthogonal minimal positive bases. Using the technique described in Theorem 4.3, one can define rotation matrices that act on a single subspace from the decomposition of the OSPB without affecting the remaining subspaces, thanks to orthogonality. The next corollary states the result.
Corollary 4.1
Corollary 4.1 thus provides a useful strategy for designing positive k-bases with a guaranteed k-cosine measure, since a lower bound on that quantity is given by and that measure can be efficiently computed through Algorithm 1. In particular, applying this strategy to a minimal positive basis yields a simple way of generating a positive k-basis with vectors, as shown by Theorem 4.3 above.
As with Theorem 4.3, the result of Corollary 4.1 does not apply to all OSPBs. In particular, the maximal OSPB decomposes into minimal positive bases over one-dimensional subspaces, which precludes the application of Theorem 4.3. Note, however, that Proposition 4.1 still provides a way to compute PSSs with cosine measure guarantees for such a maximal OSPB.
5 Conclusion
In this paper, we have studied two classes of positive spanning sets. On the one hand, we have investigated orthogonally structured positive bases, which possess a favorable structure in that they decompose into minimal positive bases over orthogonal subspaces. By exploiting this property, we described an algorithm that computes this structure, as well as the value of the cosine measure, in polynomial time, thereby improving on an existing procedure for generic positive spanning sets. On the other hand, we have conducted a detailed study of positive k-spanning sets, a relatively understudied class of PSSs with resilience properties. We have provided several characterizations of these sets, including a generalization of the cosine measure, the k-cosine measure. We have also leveraged OSPBs to build positive k-spanning sets with rotation-based guarantees on their k-cosine measure.
Our results open the way to several promising areas of research. We believe that the cosine measure computation technique can be further improved for positive bases by leveraging their decomposition, although the presence of critical vectors poses a number of challenges when dealing with positive bases that are not orthogonally structured. Such results would also improve the practical computation of k-cosine measures. Finally, designing derivative-free optimization methods based on positive k-spanning sets with guarantees on their k-cosine measure is a natural application of our study, and is the subject of ongoing work.
References
- [1] C. Audet, A short proof on the cardinality of maximal positive bases, Optim. Lett., 5 (2011), pp. 191–194.
- [2] C. Audet and W. Hare, Derivative-Free and Blackbox Optimization, Springer Series in Operations Research and Financial Engineering, Springer International Publishing, 2018.
- [3] A. Conn, K. Scheinberg, and L. Vicente, Introduction to Derivative-Free Optimization, vol. 8, SIAM, 2009.
- [4] C. Davis, Theory of positive linear dependence, American Journal of Mathematics, 76 (1954), pp. 733–746.
- [5] W. Hare and G. Jarry-Bolduc, Calculus identities for generalized simplex gradients: Rules and applications, Submitted to SIAM Journal on Optimization, (2018).
- [6] W. Hare and G. Jarry-Bolduc, A deterministic algorithm to compute the cosine measure of a finite positive spanning set, Optim. Lett., 14 (2020), pp. 1305–1316.
- [7] W. Hare, G. Jarry-Bolduc, and C. Planiden, Nicely structured positive bases with maximal cosine measure, Optim. Lett., (2023).
- [8] J. Hopcroft and R. Tarjan, Algorithm 447: Efficient algorithms for graph manipulation, Commun. ACM, 16 (1973), pp. 372–378.
- [9] G. Jarry-Bolduc, Structures in Derivative Free Optimization: Approximating Gradients/ Hessians, and Positive Bases, PhD thesis, University of British Columbia, 2023. https://circle.ubc.ca/.
- [10] T. G. Kolda, R. M. Lewis, and V. Torczon, Optimization by direct search: New perspectives on some classical and modern methods, SIAM Rev., 45 (2003), pp. 385–482.
- [11] R. M. Lewis and V. Torczon, Rank ordering and positive bases in pattern search algorithms, Tech. Rep., Institute for Computer Applications in Science and Engineering, Hampton, VA, 1996.
- [12] D. A. Marcus, Minimal positive 2-spanning sets of vectors, Proceedings of the AMS, 82 (1981), pp. 165–172.
- [13] D. A. Marcus, Gale diagrams of convex polytopes and positive spanning sets of vectors, Discrete Applied Mathematics, 9 (1984), pp. 47–67.
- [14] A. Marsden, Eigenvalues of the Laplacian and their relationship to the connectedness of a graph, University of Chicago, REU, (2013).
- [15] R. McKinney, Positive bases for linear spaces, Transactions of the American Mathematical Society, 103 (1962), pp. 131–148.
- [16] G. Naevdal, Positive bases with maximal cosine measure, Optimization Letters, 13 (2019), pp. 1381–1388.
- [17] R. Regis, The calculus of simplex gradients, Optimization Letters, 9 (2015), pp. 845–865.
- [18] R. Regis, On the properties of positive spanning sets and positive bases, Optimization and Engineering, 17 (2016), pp. 229–262.
- [19] R. Regis, On the properties of the cosine measure and the uniform angle subspace, Comput. Optim. Appl., 78 (2021), pp. 915–952.
- [20] L. Roberts and C. W. Royer, Direct search based on probabilistic descent in reduced spaces. arXiv:2204.01275v2, 2023.
- [21] Z. Romanowicz, Geometric structure of positive bases in linear spaces, Applicationes Mathematicae, 19 (1987), pp. 557–567.
- [22] R. Wotzlaw, Incidence graphs and unneighborly polytopes, PhD thesis, Technische Universität Berlin, 2009.