
The Polytope of Optimal Approximate Designs: Extending the Selection of Informative Experiments

Radoslav Harman, Lenka Filová, Samuel Rosa

Faculty of Mathematics, Physics and Informatics, Comenius University, Mlynská Dolina F1, 842 48 Bratislava, Slovakia

[email protected]
Abstract

Consider the problem of constructing an experimental design, optimal for estimating parameters of a given statistical model with respect to a chosen criterion. To address this problem, the literature usually provides a single solution. Often, however, there exists a rich set of optimal designs, and the knowledge of this set can lead to substantially greater freedom to select an appropriate experiment. In this paper, we demonstrate that the set of all optimal approximate designs generally corresponds to a polytope. Particularly important elements of the polytope are its vertices, which we call vertex optimal designs. We prove that the vertex optimal designs possess unique properties, such as small supports, and outline strategies for how they can facilitate the construction of suitable experiments. Moreover, we show that for a variety of situations it is possible to construct the vertex optimal designs with the assistance of a computer, by employing error-free rational-arithmetic calculations. In such cases the vertex optimal designs are exact, often closely related to known combinatorial designs. Using this approach, we were able to determine the polytope of optimal designs for some of the most common multifactor regression models, thereby extending the choice of informative experiments for a large variety of applications.

keywords:
Multifactor regression model, Optimal design of experiment, Polytope, Rational arithmetic, Vertex optimal design

1 Introduction

The optimal design of experiments is an important area of theoretical and applied statistics (e.g., Fedorov [17], Pázman [41], Pukelsheim [44], Atkinson et al. [2], Dean et al. [13]), closely related to various fields of mathematics and computer science, such as probability theory, orthogonal arrays, network theory, machine learning and mathematical programming (e.g., Dette and Studden [12], Hedayat et al. [30], Ghosh et al. [24], He [28], Todd [53]). In fact, one of the aims of this paper is to initiate a deeper exploration of the relationships between optimal experimental design, the theory of multidimensional polytopes, and rational-arithmetic computation.

From a statistical perspective, an optimal design is an experimental plan that maximizes the information contained in the observations. (In the following section, we will rigorously define the technical terms and prove all important claims used in the introduction.) The specific measure of information, called a criterion, depends on the goal of the experiment. The most prominent are the criteria of D- and A-optimality, both measuring the precision of the parameter estimation for the underlying statistical model.

We will primarily focus on “approximate” experimental design (also called designs “for infinite-sample size”, or “continuous” designs), introduced by Kiefer [33]. An approximate design specifies a finite set of experimental conditions, which is called the support of the design, and the proportions of trials performed at each of these conditions, referred to as the design weights. In contrast, an exact design determines integer numbers of trials for the support points; often, these integer numbers are obtained from the proportions dictated by an approximate design, by means of appropriate rounding. We will use the term “design”, without any qualifier, to denote specifically an approximate experimental design.

The literature provides analytic forms of optimal designs (i.e., their supports and weights) for many statistical models and criteria. The focus is usually on obtaining a single optimal design; however, there is often an infinite set of optimal designs.

Example 1.

Consider an experiment with 3 continuous factors, the levels of which can be chosen anywhere within the interval [-1,+1]. Assume that for different trials, the observations are independent, and at any combination (x_1,x_2,x_3) of factor levels, the observation satisfies the standard linear regression model

Y_{x_1,x_2,x_3}=\beta_1+\beta_2 x_1+\beta_3 x_2+\beta_4 x_3+\epsilon, (1)

where \beta_1,\beta_2,\beta_3,\beta_4 are the unknown parameters, and \epsilon is an unobservable error with \mathrm{E}(\epsilon)\equiv 0 and \mathrm{Var}(\epsilon)\equiv\sigma^2\in(0,\infty). Let us choose the criterion of D-optimality. Then, the uniform design that assigns 1/8 of the trials to each of the 8 factor level combinations in \{-1,+1\}^3, i.e., an approximate-design analogue of the full factorial design with 3 binary factors, is optimal. However, consider any \alpha\in[0,1] and a design that assigns \alpha/4 of the trials to each of the factor level combinations (-1,-1,-1), (+1,+1,-1), (+1,-1,+1), (-1,+1,+1), and (1-\alpha)/4 of the trials to each of the factor level combinations (+1,+1,+1), (-1,-1,+1), (-1,+1,-1), (+1,-1,-1). It is simple to show that such a design is also optimal. That is, there is a continuum of optimal designs.
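To make the claim concrete, the following short R sketch (our own illustration, not part of the original argument) numerically verifies that every design in the above family has information matrix \mathbf{I}_4, the same as the uniform design, and is therefore D-optimal as well.

f <- function(x) c(1, x)                                    # regressors f(x) = (1, x1, x2, x3)'
S1 <- rbind(c(-1,-1,-1), c(1,1,-1), c(1,-1,1), c(-1,1,1))   # points with weight alpha/4
S2 <- rbind(c(1,1,1), c(-1,-1,1), c(-1,1,-1), c(1,-1,-1))   # points with weight (1-alpha)/4
info <- function(X, w)                                      # M(xi) = sum_i w_i f(x_i) f(x_i)'
  Reduce(`+`, lapply(seq_len(nrow(X)), function(i) w[i] * tcrossprod(f(X[i, ]))))
for (alpha in c(0, 0.3, 1)) {
  M <- info(rbind(S1, S2), c(rep(alpha/4, 4), rep((1 - alpha)/4, 4)))
  print(isTRUE(all.equal(M, diag(4))))                      # TRUE for each alpha tried
}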


The knowledge of more than one optimal design can be of theoretical as well as practical significance: alternative optimal designs may facilitate the best choice of the actual experiment. Of particular interest are optimal designs with small supports, because they may be suitable for assessing measurement errors, can be less expensive to execute, and can be more easily rounded to efficient exact designs. However, an optimal design with a large support may sometimes be more appropriate, because it may for instance better allow for the assessment of lack of fit of the model. Therefore, our purpose is to analyze the set of all optimal designs for a given experimental design problem.

The available literature only provides sporadic results on the nonuniqueness of optimal designs and on the selection of a particular optimal design. The fact that the optimal design is not always unique, and the problem of finding optimal designs with special characteristics, such as the “minimal” optimal designs, have already been mentioned by Kiefer [33] and Farrell et al. [16]. The term “absolutely” minimal optimal design was introduced by Farrell et al. [16] and later used by Pesotchinsky [42]. Next, Pesotchinsky [43] analyzed optimal designs for the quadratic regression on cubic regions and stated conditions for the uniqueness of the optimal design in this model.

In multifactor models, the simplest optimal design is often supported on a large product grid. For some models of this type, alternative optimal designs supported on smaller sets have been constructed (e.g., Corollary 3.5 in Cheng [8], Theorem 2.1 in Lim and Studden [36], page 378 in Pukelsheim [44] and Section 5 in Schwabe and Wong [51]). A result of Schwabe and Wierich [50] provides conditions that characterize all D-optimal designs in additive multifactor models with constant term and without interactions via the D-optimal designs in the single-factor models. Further, for the additive model in crossover designs, there can be a multitude of optimal designs, and Kushner [34] suggests searching for them using linear programming. Focusing on applications, Robinson and Anderson-Cook [46] analyze the situation when there are several exact D-optimal screening designs and suggest choosing the suitable one subject to a secondary characteristic. More recently, Rosa and Harman [48] employed linear programming to select optimal designs with small supports in models with treatment and nuisance effects.

In this paper, unlike most of the optimal design literature, we do not solve the problem of constructing a single optimal design. Instead, we assume that we already have one, or that it can be easily constructed using theoretical or numerical methods. We demonstrate that for a given model and criterion, in cases where the optimal design is not unique, the set of all optimal designs is typically a polytope. Additionally, we provide various theoretical properties of this polytope. We show that its vertices, which we call vertex optimal designs, can not only be used to succinctly describe all other optimal designs, but themselves have various useful characteristics, such as small supports.

Example 2.

In Example 1, the uniform design on \{-1,+1\}^3 is a trivial D-optimal design. However, the set of designs from Example 1 parametrized by \alpha\in[0,1] is exactly the set of all D-optimal designs for Model (1). That is, the polytope of optimal designs is a line segment, with two vertices. These vertices correspond to the two vertex optimal designs, defined by \alpha=0 and \alpha=1; they are the approximate design analogues of the two D-optimal 2^{3-1} fractional factorial designs. The vertex optimal designs are the only minimal optimal designs. That is, they are the only optimal designs with the following property: in the sense of set inclusion, there is no optimal design with a smaller support. On the other hand, note that the support of any optimal design corresponding to \alpha\in(0,1) is maximal in the sense that it covers the supports of all optimal designs.


As we will show, the polytope of optimal designs can be described in a relatively straightforward way as an intersection of halfspaces, which is called the H-representation of a polytope. However, enumerating the vertices of a polytope given by its H-representation generally requires extensive calculations. This is known as the vertex enumeration problem.

The vertex enumeration problem can be solved by the pivoting algorithms based on the simplex method of linear programming (e.g., Avis and Fukuda [4]) and the nonpivoting algorithms based on the double description method (Motzkin et al. [39], Avis and Jordan [5]). An overview of the vertex enumeration methods can be found in Matheiss and Rubin [38] or Chapter 8 in Fukuda [21]. In experimental design, the double description algorithm has been used by Snee [52] and by Coetzer and Haines [10] to compute the extreme points of the design space in Scheffé mixture models with multiple linear constraints. In He [29], the algorithm facilitated the construction of a special type of minimax space-filling designs.

Computationally, the general vertex enumeration problem is NP-complete (Khachiyan et al. [32]). Despite its theoretical complexity, a key property of this problem is that if the H-representation of the polytope involves only rational numbers, the coordinates of its vertices are also rational numbers. Moreover, in this case the vertex enumeration algorithms can execute all calculations via rational arithmetic, thus providing an error-free method of constructing the vertices (e.g., Fukuda [20]), which is akin to a computer-assisted proof. For a general exposition of error-free computing in geometric problems, see Eberly [14].

This fact has important consequences for the construction of vertex optimal designs of experiments. The reason is that for many optimal design problems, the H-representation of the optimal polytope is fully rational. Consequently, by employing the rational-arithmetic computations, we can provide a complete list of all vertex optimal designs for a variety of standard models and a large class of criteria, involving both D- and A-optimality. In addition, for the rational optimal design problems, the vertex optimal designs are exact, i.e., realizable with a finite number of trials. Often, they correspond to standard combinatorial designs of experiments.

It is also possible to apply floating-point versions of the vertex enumeration algorithms for arbitrary optimal design problems as a general algorithmic method of constructing the vertex optimal designs. However, the outputs of such algorithms are not always perfectly precise due to the nature of floating-point computations. Since our aim is to provide results that are completely reliable from the theoretical point of view, we only focus on rational optimal design problems.

The rest of this paper is organized as follows. Section 2.1 sets the background on the theory of optimal design. In Section 2.2, we introduce the notions of a maximal optimal design and maximum optimal support, which play important roles in the construction of the polytope of optimal designs. In Sections 2.3 and 2.4, we mathematically describe the optimal polytopes viewed as sets of finite-dimensional vectors and as finitely supported probability measures, respectively. Section 2.5 introduces the class of rational optimal design problems, for which it is possible to completely and reliably determine the optimal polytopes. Section 2.6 provides suggestions for utilizing the polytope of optimal designs for constructing experimental designs with specific properties. In Section 3, we characterize the polytopes of optimal designs for various common models and criteria. Section 4 offers concluding remarks, highlighting key insights of our findings and avenues for further study. Additionally, appendices include a reference list of the notation, all non-trivial proofs, and other supporting information.

2 Theory

2.1 Optimal design

Let \mathcal{X}\subset\mathbb{R}^k be a compact design space, that is, the set of all design points available for the trials. Note that such a definition of \mathcal{X} includes both continuous and discrete spaces.

Consider a continuous function \mathbf{f}:\mathcal{X}\to\mathbb{R}^m, m\geq 1, that defines a linear regression model on \mathcal{X} in the following sense: provided that the experimental conditions of a trial are determined by the design point \mathbf{x}\in\mathcal{X}, the observation is Y_{\mathbf{x}}=\mathbf{f}^{\prime}(\mathbf{x})\beta+\epsilon, where \beta\in\mathbb{R}^m is a vector of unknown parameters, and \epsilon is an unobservable random error with expected value 0 and variance \sigma^2\in(0,\infty). The random errors across different trials are assumed to be independent.

We remark that we restrict ourselves to linear homoscedastic regression models only for simplicity; the results of this paper can also be applied to nonlinear regression models through the approach of local optimality and to cases of non-constant error variance by means of a simple model transformation (e.g., [2], Chapters 17 and 22).

As is usual in the optimal design literature (e.g., Section 1.24 in [44]), we will formalize an approximate design as a probability measure \xi on \mathcal{X} with a finite support \mathcal{S}(\xi):=\{\mathbf{x}_1,\ldots,\mathbf{x}_h\}, where h is the support size, and \mathbf{x}_1,\ldots,\mathbf{x}_h\in\mathcal{X} are the support points. The interpretation is that for any \mathbf{x}\in\mathcal{S}(\xi) the number \xi(\mathbf{x}):=\xi(\{\mathbf{x}\})>0 gives the proportion of the trials to be performed under the experimental conditions determined by the design point \mathbf{x}. For clarity, we will sometimes represent the design \xi by the table

\left[\begin{array}{cccc}\mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_h \\ \hline \xi(\mathbf{x}_1) & \xi(\mathbf{x}_2) & \cdots & \xi(\mathbf{x}_h)\end{array}\right].

For a design \xi, the normalized information matrix is the positive semi-definite matrix

\mathbf{M}(\xi)=\sum_{\mathbf{x}\in\mathcal{S}(\xi)}\xi(\mathbf{x})\mathbf{f}(\mathbf{x})\mathbf{f}^{\prime}(\mathbf{x}),

which is of size m\times m. We will assume that the problem at hand is nondegenerate in the sense that there is a design \xi such that \mathbf{M}(\xi) is nonsingular.

The quality of the design \xi is measured by the “size” of \mathbf{M}(\xi), assessed through a utility function known as a criterion. For simplicity (it is straightforward to use the methods developed in this paper with any information function \Phi that is strictly concave in the sense of Section 5.2 in [44]), we will focus on Kiefer’s criteria \Phi_p with a parameter p\in(-\infty,0] in their information-function versions, as defined in Chapter 6 of Pukelsheim [44]. For a singular \mathbf{M}(\xi), the value of the \Phi_p-criterion is 0, and for a non-singular \mathbf{M}(\xi) it is defined by

\Phi_p(\mathbf{M}(\xi))=\begin{cases}\det^{1/m}(\mathbf{M}(\xi)),&\text{if }p=0,\\ m^{-1/p}\,\mathrm{tr}^{1/p}(\mathbf{M}^p(\xi)),&\text{if }p<0.\end{cases}

This class includes the two most important criteria: D-optimality (p=0) and A-optimality (p=-1). We also implicitly cover the criterion of I-optimality (also called V- or IV-optimality), because the problem of I-optimal design is equivalent to the problem of A-optimal design in a linearly transformed model (e.g., Section 9.8 in Pukelsheim [44]).
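As a small illustration, the following R sketch (an assumed implementation of our own, consistent with the definition above) evaluates \Phi_p through the eigenvalues of the information matrix.

Phi_p <- function(M, p = 0) {
  m   <- nrow(M)
  lam <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  if (min(lam) < 1e-12) return(0)            # singular M: the criterion value is 0
  if (p == 0) prod(lam)^(1/m)                # D-optimality: det(M)^{1/m}
  else mean(lam^p)^(1/p)                     # (tr(M^p)/m)^{1/p}; p = -1 gives A-optimality
}
Phi_p(diag(4), p = 0); Phi_p(diag(4), p = -1)   # both equal 1 for M = I_4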

For the chosen p, the \Phi_p-optimal design is any design \xi^{*} that maximizes \Phi_p(\mathbf{M}(\xi)) in the set of all designs \xi. Note that our assumptions imply that there exists at least one \Phi_p-optimal design.

The most important result on optimal approximate designs is the so-called equivalence theorem; for Kiefer’s criteria it states that a design \xi^{*} is \Phi_p-optimal if and only if \mathbf{M}(\xi^{*}) is non-singular and

\mathbf{f}^{\prime}(\mathbf{x})\mathbf{M}^{p-1}(\xi^{*})\mathbf{f}(\mathbf{x})\leq\mathrm{tr}(\mathbf{M}^{p}(\xi^{*}))\text{ for all }\mathbf{x}\in\mathcal{X}, (2)

with equality for all \mathbf{x}\in\mathcal{S}(\xi^{*}) (e.g., Section 7.20 in Pukelsheim [44]). In addition, for several standard linear regression models (e.g., those analyzed in Sections 3.2, 3.3 and 3.4), there exists a simple design \xi^{*} such that \mathbf{M}(\xi^{*})=\lambda\mathbf{I}_m for some \lambda>0 and

\max_{\mathbf{x}\in\mathcal{X}}\|\mathbf{f}(\mathbf{x})\|^2=m\lambda. (3)

Then, by Corollary 4 of [26], the design \xi^{*} is \Phi_p-optimal for all parameters p.
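For illustration, the following R sketch (our own check, taking p=0 so that \mathbf{M}^{p-1}=\mathbf{M}^{-1} and \mathrm{tr}(\mathbf{M}^p)=m) verifies condition (2) for the uniform design on \{-1,+1\}^3 in Model (1) over a grid of the design space.

f <- function(x) c(1, x)
Y <- as.matrix(expand.grid(c(-1, 1), c(-1, 1), c(-1, 1)))     # support of the design
M <- Reduce(`+`, lapply(seq_len(nrow(Y)), function(i) tcrossprod(f(Y[i, ])) / nrow(Y)))
Minv <- solve(M)                                              # M^{p-1} for p = 0
dvar <- function(x) drop(t(f(x)) %*% Minv %*% f(x))           # left-hand side of (2)
grid <- as.matrix(expand.grid(seq(-1, 1, 0.25), seq(-1, 1, 0.25), seq(-1, 1, 0.25)))
max(apply(grid, 1, dvar)) <= 4 + 1e-9                         # TRUE: (2) holds, with tr(M^0) = m = 4
all(abs(apply(Y, 1, dvar) - 4) < 1e-9)                        # TRUE: equality on the support points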

Let p be fixed. While there can be an infinite number of \Phi_p-optimal designs, the properties of Kiefer’s criteria and the assumption of nondegeneracy imply that the information matrix of all \Phi_p-optimal designs is the same; see, e.g., Section 8.14 in Pukelsheim [44]. We will denote this unique and necessarily nonsingular \Phi_p-optimal information matrix by \mathbf{M}_{*}. In the rest of this paper, if the criterion is general, fixed, or known from the context, we will drop the prefix \Phi_p-.

Of particular importance are the notions of a minimal optimal design and an absolutely minimal optimal design, introduced by Farrell et al. [16]. An optimal design \xi^{*} is a minimal optimal design if there is no optimal design whose support is a proper subset of \mathcal{S}(\xi^{*}). (Some authors use the expression “minimally-supported design” to denote any design with the support size equal to the number of parameters (Chang and Lin [7], Li and Majumdar [35] and many others), which is a different concept.) An optimal design \xi^{*} is called an absolutely minimal optimal design if there is no optimal design whose support has fewer elements than \mathcal{S}(\xi^{*}).

2.2 Maximum optimal support

As stated in the introduction, our aim is to characterize the set of all optimal designs, if we already know one optimal design. To this end, we will assume that we know a “maximal” optimal design \xi^{*}, which we define as an optimal design satisfying \mathcal{S}(\xi^{*})\supseteq\mathcal{S}(\tilde{\xi}^{*}) for any other optimal design \tilde{\xi}^{*}. We emphasize that in many models, such as those analysed in Section 3, the known and simplest optimal design \xi^{*} satisfies

\mathbf{f}^{\prime}(\mathbf{z})\mathbf{M}^{p-1}_{*}\mathbf{f}(\mathbf{z})<\mathrm{tr}(\mathbf{M}^{p}_{*})\text{ for all }\mathbf{z}\in\mathcal{X}\setminus\mathcal{S}(\xi^{*}), (4)

i.e., \xi^{*} is maximal, owing to the equivalence theorem for \Phi_p-optimality.

Of course, for some optimal design problems, a maximal optimal design is not immediately apparent. Even then, it is usually possible to construct one by a procedure that inspects either the design points satisfying equality in (2), or the points of a theoretically derived finite set necessarily covering the supports of all optimal designs; see, e.g., Section 3 in [31] or Lemma 4.2 in [22]. (Since there is a large variety of models with directly available maximal optimal designs to justify the broad applicability of our method, see Section 3, we decided not to detail such an inspection procedure or the theoretical results on possible supports of optimal designs in this paper.) Note that some very specific models, such as the trigonometric regression on [0,2\pi] (Section 9.16 in Pukelsheim [44]), do not admit any maximal optimal design. This is because in these models the union of the supports of all optimal designs is an infinite set and, according to the standard definition we use, any design only has a finite support. We exclude such optimal design problems from our further considerations.

While there can be many maximal optimal designs, their supports must be the same; we will denote the unique “maximum optimal support” by \mathcal{Y}:=\{\mathbf{y}_1,\ldots,\mathbf{y}_d\}, where d is the size of the maximum optimal support, and the regressor vectors corresponding to the trials at \mathcal{Y} by \mathbf{f}_i:=\mathbf{f}(\mathbf{y}_i), for all i=1,\ldots,d.

Therefore, if our aim is to study only optimal designs, we can completely restrict our attention to the set \mathcal{Y} and disregard the design points from \mathcal{X}\setminus\mathcal{Y}. Let us henceforth denote by \Upsilon the set of all designs supported on \mathcal{Y}.

Example 3.

Consider Model (1) from Examples 1 and 2. Here, the design space is \mathcal{X}=[-1,+1]^3; that is, we have k=3 continuous factors. The regression function is \mathbf{f}(\mathbf{x})=(1,x_1,x_2,x_3)^{\prime} for all \mathbf{x}=(x_1,x_2,x_3)^{\prime}\in\mathcal{X}, i.e., the number of parameters is m=4. Let \xi^{*} be the uniform design on the set \mathcal{Y}=\{-1,+1\}^3 with d=8 elements. A simple calculation yields \mathbf{M}(\xi^{*})=\sum_{\mathbf{y}\in\mathcal{Y}}\frac{1}{8}\mathbf{f}(\mathbf{y})\mathbf{f}^{\prime}(\mathbf{y})=\mathbf{I}_4. The design \xi^{*} is optimal for any \Phi_p, because \|\mathbf{f}(\mathbf{x})\|^2\leq 4 for all \mathbf{x}\in\mathcal{X}, which is equivalent to (2). In addition, we clearly have \|\mathbf{f}(\mathbf{z})\|^2<4 for all \mathbf{z}\in\mathcal{X}\setminus\mathcal{Y}, which is a transcription of (4); hence, \xi^{*} is a maximal optimal design, and \mathcal{Y} is the maximum optimal support. All \Phi_p-optimal designs for Model (1) therefore belong to the set \Upsilon of designs supported on \mathcal{Y}.

2.3 Polytope of optimal weights

Because \mathcal{Y} is finite, any design \zeta in \Upsilon can be characterized by the weights that it assigns to the design points \mathbf{y}_1,\ldots,\mathbf{y}_d. We will call \mathbf{w}_{\zeta}=(\zeta(\mathbf{y}_1),\ldots,\zeta(\mathbf{y}_d))^{\prime} the vector of weights of \zeta. This correspondence allows us to work with vectors from \mathbb{R}^d instead of measures on \mathcal{Y}. The components of each \mathbf{w}_{\zeta} must be non-negative and sum to one; the set of all weight-vectors on \mathcal{Y} is therefore the probability simplex \mathbb{P}:=\{\mathbf{w}\in\mathbb{R}^d:\mathbf{1}^{\prime}_d\mathbf{w}=1,\mathbf{w}\geq\mathbf{0}_d\}. Analogously, given a vector of weights \mathbf{w}\in\mathbb{P}, the design \zeta_{\mathbf{w}} determined by \mathbf{w} satisfies \zeta_{\mathbf{w}}(\mathbf{y}_i)=w_i for all i=1,\ldots,d.

Because all optimal designs have the same information matrix, the set of all weight-vectors corresponding to the optimal designs is

\mathbb{P}_{*}:=\{\mathbf{w}\in\mathbb{P}:\mathbf{M}(\zeta_{\mathbf{w}})=\mathbf{M}_{*}\},

where \mathbf{M}(\zeta_{\mathbf{w}})=\sum_{i=1}^{d}w_i\mathbf{f}_i\mathbf{f}^{\prime}_i. Since \mathbb{P} is a bounded superset of \mathbb{P}_{*}, and the equalities \mathbf{M}(\zeta_{\mathbf{w}})=\mathbf{M}_{*} are linear in \mathbf{w}, the set \mathbb{P}_{*} is a polytope in \mathbb{R}^d. (Because the definition of the term “polytope” somewhat varies across the mathematical literature, we remark that by a polytope we mean a bounded convex polyhedron, i.e., a bounded intersection of closed half-spaces. For an introduction to the theory of polytopes, see, e.g., Ziegler [55].) Therefore, we will say that \mathbb{P}_{*} is the “polytope of optimal weights”. A compact expression of the polytope of optimal weights utilizes the matrix

\mathbf{A}:=\bigl[\mathrm{vech}\left(\mathbf{f}_1\mathbf{f}^{\prime}_1\right),\ldots,\mathrm{vech}\left(\mathbf{f}_d\mathbf{f}^{\prime}_d\right)\bigr],

where \mathrm{vech} denotes the half-vectorization of a symmetric matrix. Therefore, the number of rows of \mathbf{A} is q:=\frac{1}{2}m(m+1).
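As an illustration (our own sketch in R), the matrix \mathbf{A} for the maximum optimal support \mathcal{Y}=\{-1,+1\}^3 of Model (1) can be assembled as follows; the function vech below is an assumed helper implementing the half-vectorization.

vech <- function(S) S[lower.tri(S, diag = TRUE)]              # half-vectorization
f    <- function(x) c(1, x)
Y    <- as.matrix(expand.grid(c(-1, 1), c(-1, 1), c(-1, 1)))  # d = 8 support points
A    <- sapply(seq_len(nrow(Y)), function(i) vech(tcrossprod(f(Y[i, ]))))
dim(A)                                                        # q x d = 10 x 8, with q = m(m+1)/2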

Lemma 1.

The set of optimal design weights is

\mathbb{P}_{*}=\{\mathbf{w}\geq\mathbf{0}_d:\mathbf{A}\mathbf{w}=\mathrm{vech}(\mathbf{M}_{*})\}. (5)

The proof of Lemma 1 requires showing that \mathbf{M}(\zeta_{\mathbf{w}})=\mathbf{M}_{*} for some \mathbf{w}\geq\mathbf{0}_d entails \mathbf{1}_d^{\prime}\mathbf{w}=1; see Appendix B for the proofs. Note that (5) is geometrically the intersection of a finite number of halfspaces, that is, it is an H-representation of \mathbb{P}_{*}.

For the mathematical properties of the optimal design problem at hand, a key characteristic is the rank s of \mathbf{A}, which we will call the “problem rank”. In the following lemma, the “dimension” of a subset of a vector space refers to the affine dimension of its affine hull.

Lemma 2.

(a) The bounds for the problem rank are m\leq s\leq\min(d,q); (b) The dimension of the set \{\mathbf{M}(\zeta):\zeta\in\Upsilon\} of the information matrices of all designs in \Upsilon is s-1; (c) The dimension of the polytope \mathbb{P}_{*} of optimal design weights is t:=d-s.

A simple consequence of Lemma 2 is that the polytope \mathbb{P}_{*} is trivial (t=0), i.e., the optimal design is unique, if and only if d=s, which is if and only if \mathbf{A} has full column rank. Note also that Lemma 2, parts (a) and (c), imply t\leq d-m. It follows that for the case d=m the polytope \mathbb{P}_{*} is trivial.
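A minimal R sketch (illustrative; it recomputes the matrix \mathbf{A} of the previous snippet) obtains the problem rank s and the dimension t=d-s for Model (1), in agreement with Example 4 below.

vech <- function(S) S[lower.tri(S, diag = TRUE)]
f <- function(x) c(1, x)
Y <- as.matrix(expand.grid(c(-1, 1), c(-1, 1), c(-1, 1)))
A <- sapply(seq_len(nrow(Y)), function(i) vech(tcrossprod(f(Y[i, ]))))
s     <- qr(A)$rank       # problem rank; here s = 7
t_dim <- ncol(A) - s      # dimension of the polytope P_*; here t = 1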

Because the dimension of \mathbb{P}_{*} is t<d, \mathbb{P}_{*} belongs to some proper affine subspace of \mathbb{R}^d. However, we can transform \mathbb{P}_{*} to a full-dimensional polytope \mathbb{T}_{*}\subset\mathbb{R}^t that has the same geometric properties (the number and incidence of the faces of all dimensions, the distance between vertices and so on), as follows.

Let t>0, let the columns of \mathbf{U}=(\mathbf{u}_1,\ldots,\mathbf{u}_t) form an orthonormal basis of the null space \mathcal{N}(\mathbf{A}) and let \xi^{*} be a maximal optimal design with the vector of weights \mathbf{w}^{*}>\mathbf{0}_d. Define an affine mapping A on \mathbb{R}^t by A(\mathbf{t})=\mathbf{U}\mathbf{t}+\mathbf{w}^{*}. Note that the image of A is the affine space \mathcal{A}=\mathcal{N}(\mathbf{A})+\mathbf{w}^{*}\supset\mathbb{P}_{*}, and its inverse on \mathcal{A} is A^{-1}(\mathbf{w})=\mathbf{U}^{\prime}(\mathbf{w}-\mathbf{w}^{*}). Therefore, A is an affine bijection between \mathbb{R}^t and \mathcal{A}, which means that the geometric structure of \mathbb{P}_{*} is the same as the structure of

\mathbb{T}_{*}:=A^{-1}(\mathbb{P}_{*})=\{\mathbf{t}\in\mathbb{R}^t:\mathbf{U}\mathbf{t}\geq-\mathbf{w}^{*}\}. (6)

The interior of \mathbb{T}_{*} contains the origin \mathbf{0}_t. The full-dimensional polytope \mathbb{T}_{*} can be used for the numerical construction of specific optimal designs as outlined in Section 2.6, as well as for a visualization of the set of all optimal designs if t\leq 3.
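The construction of \mathbb{T}_{*} can be sketched in R as follows (our own illustration for Model (1)): an orthonormal basis \mathbf{U} of \mathcal{N}(\mathbf{A}) is taken from the singular value decomposition, and the affine map A(\mathbf{t})=\mathbf{U}\mathbf{t}+\mathbf{w}^{*} of (6) sends the endpoints of the resulting interval to the two vertices of \mathbb{P}_{*}.

vech <- function(S) S[lower.tri(S, diag = TRUE)]
f <- function(x) c(1, x)
Y <- as.matrix(expand.grid(c(-1, 1), c(-1, 1), c(-1, 1)))
A <- sapply(seq_len(nrow(Y)), function(i) vech(tcrossprod(f(Y[i, ]))))
sv <- svd(A)
U  <- sv$v[, sv$d < 1e-9, drop = FALSE]       # orthonormal basis of N(A); here 8 x 1, so t = 1
w_star <- rep(1/8, 8)                         # weights of the uniform maximal optimal design
Amap <- function(tt) drop(U %*% tt + w_star)  # the map A(t) = U t + w* of (6)
ends <- c(max((-w_star / U)[U > 0]), min((-w_star / U)[U < 0]))   # endpoints of T_*
Amap(ends[1]); Amap(ends[2])                  # weight vectors of the two vertices of P_*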

Key elements of \mathbb{P}_{*} are its vertices. Note that the number \ell of these vertices is finite; we will denote them \mathbf{v}_1,\ldots,\mathbf{v}_{\ell}. The representation (5) of \mathbb{P}_{*} is in fact in the standard form of the feasible set of a linear program. As such, the well-established theory of linear programming can be used to give the following characterizations of \mathbf{v}_1,\ldots,\mathbf{v}_{\ell}.

Theorem 1.

For \mathbf{w}\in\mathbb{R}^d let \mathcal{S}(\mathbf{w})=\{i\in\{1,\ldots,d\}:w_i\neq 0\}. The following statements are equivalent:

1. \mathbf{v} is a vertex of \mathbb{P}_{*}, i.e., \mathbf{v} is the unique solution to the optimization problem \min\{\mathbf{c}^{\prime}\mathbf{w}:\mathbf{w}\in\mathbb{P}_{*}\} for some \mathbf{c}\in\mathbb{R}^d.

2. \mathbf{v} is an extreme point of \mathbb{P}_{*}, i.e., \mathbf{v}\in\mathbb{P}_{*}, and if \mathbf{w}_1,\mathbf{w}_2\in\mathbb{P}_{*} are such that \mathbf{v}=\alpha\mathbf{w}_1+(1-\alpha)\mathbf{w}_2 for \alpha\in(0,1) then \mathbf{w}_1=\mathbf{w}_2=\mathbf{v}.

3. \mathbf{v}\in\mathbb{P}_{*}, and the vectors \{\mathrm{vech}(\mathbf{f}_i\mathbf{f}^{\prime}_i):i\in\mathcal{S}(\mathbf{v})\} are linearly independent.

4. \mathbf{v}\in\mathbb{P}_{*}, and if \mathbf{w}\in\mathbb{P}_{*} is such that \mathcal{S}(\mathbf{w})\subseteq\mathcal{S}(\mathbf{v}), then \mathcal{S}(\mathbf{w})=\mathcal{S}(\mathbf{v}).

5. \mathbf{v}\in\mathbb{P}_{*}, and if \mathbf{w}\in\mathbb{P}_{*} is such that \mathcal{S}(\mathbf{w})\subseteq\mathcal{S}(\mathbf{v}), then \mathbf{w}=\mathbf{v}.

Example 4.

Consider Model (1) from Examples 1-3, with the maximum optimal support \mathcal{Y}=\{-1,+1\}^3 and the optimal information matrix \mathbf{M}_{*}=\mathbf{I}_4. Denote the elements of \mathcal{Y} lexicographically as \mathbf{y}_1=(-1,-1,-1)^{\prime}, \mathbf{y}_2=(-1,-1,+1)^{\prime}, \ldots, \mathbf{y}_8=(+1,+1,+1)^{\prime}. By Lemma 1, \mathbb{P}_{*}=\{\mathbf{w}\in\mathbb{R}^8:\mathbf{A}\mathbf{w}=\mathrm{vech}(\mathbf{I}_4),\mathbf{w}\geq\mathbf{0}_8\}, where the columns of \mathbf{A} are \mathrm{vech}((1,\mathbf{y}^{\prime}_i)^{\prime}(1,\mathbf{y}^{\prime}_i)), i=1,\ldots,8. The full form of \mathbf{A}\mathbf{w}=\mathrm{vech}(\mathbf{I}_4) is provided as equation (24) in Appendix C. It is a matter of simple algebra to verify that the problem rank is \mathrm{rank}(\mathbf{A})=s=7. Consequently, part (c) of Lemma 2 implies that the dimension of \mathbb{P}_{*} is t=1. Laborious but straightforward calculations show that all solutions to the system \mathbf{A}\mathbf{w}=\mathrm{vech}(\mathbf{I}_4) are \mathbf{w}=\mathbf{1}_8/8+c(\mathbf{s}_1-\mathbf{s}_2), where c\in\mathbb{R}, \mathbf{s}_1=(0,1,1,0,1,0,0,1)^{\prime}, and \mathbf{s}_2=(1,0,0,1,0,1,1,0)^{\prime}. Adding \mathbf{w}\geq\mathbf{0}_8 restricts the values of c to the interval [-1/8,1/8]. Therefore, the elements of \mathbb{P}_{*} can be parametrized as \mathbf{w}_{\alpha}=\mathbf{s}_2/4+(\alpha/4)(\mathbf{s}_1-\mathbf{s}_2), \alpha\in[0,1]. Hence, \mathbb{P}_{*} is the line segment in \mathbb{R}^8 with endpoints (vertices) \mathbf{v}_1=\mathbf{s}_1/4 and \mathbf{v}_2=\mathbf{s}_2/4. Finally, since the vector of weights of a maximal optimal design is \mathbf{w}^{*}=\frac{1}{8}\mathbf{1}_8, \mathcal{N}(\mathbf{A}) has dimension t=1, and its one-element orthonormal basis is \mathbf{U}=(\mathbf{s}_1-\mathbf{s}_2)/\sqrt{8}, formula (6) yields that a full-dimensional “polytope” isomorphic to \mathbb{P}_{*} is the interval \mathbb{T}_{*}=[-1/\sqrt{8},1/\sqrt{8}]. For this simple problem, one can also directly verify the characterization of the vertices \mathbf{v}_1 and \mathbf{v}_2 given by Theorem 1.
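The same vertices can be obtained by a rational-arithmetic vertex enumeration, which is the approach we rely on in Section 3. The R sketch below assumes the package rcdd (presumably the interface of Geyer and Meeden [23] to cddlib) and its functions makeH, scdd and d2q; the exact call signatures should be checked against the package documentation.

library(rcdd)
vech <- function(S) S[lower.tri(S, diag = TRUE)]
f <- function(x) c(1, x)
Y <- as.matrix(expand.grid(c(-1, 1), c(-1, 1), c(-1, 1)))
A <- sapply(seq_len(nrow(Y)), function(i) vech(tcrossprod(f(Y[i, ]))))
b <- vech(diag(4))                                       # vech(M_*) with M_* = I_4
hrep <- makeH(a1 = d2q(-diag(8)), b1 = d2q(rep(0, 8)),   # inequalities -w <= 0, i.e. w >= 0
              a2 = d2q(A),        b2 = d2q(b))           # equalities A w = vech(M_*)
vrep <- scdd(hrep)$output                                # double description: V-representation
V <- vrep[vrep[, 2] == "1", -(1:2), drop = FALSE]        # rows encoding the vertices
V                                                        # two vertices with rational weights 0 and 1/4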

2.4 Polytope of optimal designs

The central object of this paper is the set \Upsilon_{*} of all \Phi_p-optimal designs, i.e.,

\Upsilon_{*}:=\{\zeta\in\Upsilon:\mathbf{M}(\zeta)=\mathbf{M}_{*}\}.

Here we formulate corollaries of the results of Section 2.3 in the standard optimal design terminology, so that it is clear how these results translate from the weights \mathbf{w}\in\mathbb{R}^d to actual designs (i.e., finitely supported probability measures).

First, we show that the mathematical properties indeed translate to designs \zeta\in\Upsilon by formalizing the transformation \zeta\mapsto\mathbf{w}_{\zeta} as a linear isomorphism. Let \Upsilon^{\pm} be the linear space of all signed measures supported on \mathcal{Y}. Clearly, L:\zeta\mapsto\mathbf{w}_{\zeta}=(\zeta(\mathbf{y}_1),\ldots,\zeta(\mathbf{y}_d))^{\prime} is a linear one-to-one mapping between \Upsilon^{\pm} and \mathbb{R}^d, that is, a linear isomorphism, with inverse L^{-1}:\mathbf{w}\mapsto\zeta_{\mathbf{w}}. Note that the image of \Upsilon under L is \mathbb{P}, and the image of \Upsilon_{*} under L is \mathbb{P}_{*}. The latter observation means that \Upsilon_{*} is isomorphic to \mathbb{P}_{*}; we therefore call \Upsilon_{*} the “polytope of optimal designs”.

The Carathéodory theorem (e.g., [47], Section 17) says that for any bounded set S\subset\mathbb{R}^r, every point in the convex hull of S can be expressed as a convex combination of at most r+1 points of S. Taking Lemma 2 into account, we obtain:

Theorem 2.

For every design \zeta\in\Upsilon there is a design \zeta^{s}\in\Upsilon with the support size of at most s, such that \mathbf{M}(\zeta)=\mathbf{M}(\zeta^{s}). In particular, there exists an optimal design supported on no more than s points.

Theorem 2 is a strengthening of the standard “Carathéodory theorem” used in optimal design (e.g., Theorem 2.2.3 in Fedorov [17] or Section 8.2 in Pukelsheim [44]): we obtain the standard result for s=q, but in general s\leq q.

Recall that \zeta_{\mathbf{w}}\in\Upsilon is the design determined by the weights \mathbf{w}\in\mathbb{P} and \mathbf{v}_1,\ldots,\mathbf{v}_{\ell} are the vertices of \mathbb{P}_{*}. We will call the corresponding designs \nu_j=\zeta_{\mathbf{v}_j}\in\Upsilon_{*}, j=1,\ldots,\ell, the “vertex optimal designs” (VODs). In simpler terms, for each vertex \mathbf{v}_j=(v_{j,1},\ldots,v_{j,d})^{\prime} of \mathbb{P}_{*}, j\in\{1,\ldots,\ell\}, the uniquely determined vertex optimal design is

\nu_j=\left[\begin{array}{cccc}\mathbf{y}_1 & \mathbf{y}_2 & \cdots & \mathbf{y}_d \\ \hline v_{j,1} & v_{j,2} & \cdots & v_{j,d}\end{array}\right]\in\Upsilon_{*}.

Since the mapping \mathbf{w}\mapsto\zeta_{\mathbf{w}} is a linear isomorphism between \mathbb{P}_{*} and \Upsilon_{*}, the role of the vertices \mathbf{v}_1,\ldots,\mathbf{v}_{\ell} in \mathbb{P}_{*} is analogous to the role of the VODs \nu_1,\ldots,\nu_{\ell} in \Upsilon_{*}.

As a consequence of Lemma 2 we see that the dimension of \Upsilon_{*}, understood as the dimension of its affine hull in \Upsilon^{\pm}, is t=d-s. The knowledge of t and the inequality s\geq m provides another variant of the Carathéodory theorem expressed in terms of the VODs:

Theorem 3.

Any optimal design can be represented as a convex combination of r VODs, where r\leq t+1.

The main theorem characterizing the VODs, based directly on Theorem 1, is

Theorem 4.

The following statements are equivalent:

1. \nu is a vertex optimal design, i.e., \nu is the unique solution to the optimization problem \min\{\sum_{i=1}^{d}c_i\zeta(\mathbf{y}_i):\zeta\in\Upsilon_{*}\} for some coefficients c_1,\ldots,c_d\in\mathbb{R}.

2. \nu is an extreme optimal design, i.e., \nu\in\Upsilon_{*}, and if \zeta_1,\zeta_2\in\Upsilon_{*} are such that \nu=\alpha\zeta_1+(1-\alpha)\zeta_2 for \alpha\in(0,1) then \zeta_1=\zeta_2=\nu.

3. \nu is an optimal design with “independent support information matrices”, i.e., \nu\in\Upsilon_{*}, and the matrices \mathbf{f}(\mathbf{y})\mathbf{f}^{\prime}(\mathbf{y}), for \mathbf{y}\in\mathcal{S}(\nu), are linearly independent.

4. \nu is a minimal optimal design, i.e., \nu\in\Upsilon_{*}, and if \zeta\in\Upsilon_{*} is such that \mathcal{S}(\zeta)\subseteq\mathcal{S}(\nu), then \mathcal{S}(\zeta)=\mathcal{S}(\nu).

5. \nu is a unique optimal design on \mathcal{S}(\nu), i.e., \nu\in\Upsilon_{*}, and if \zeta\in\Upsilon_{*} is such that \mathcal{S}(\zeta)\subseteq\mathcal{S}(\nu), then \zeta=\nu.

Note that Theorem 4, specifically part 1\Leftrightarrow 3, implies that the support size of any VOD is between m and s. Theorem 4 also provides bounds on the number \ell of the VODs in terms of the numbers d, s and t; see Appendix D.

Example 5.

Consider Model (1) from Examples 1-4. The vertices of the polytope \Upsilon_{*} of all optimal designs correspond to the design weights \mathbf{v}_1 and \mathbf{v}_2 derived in Example 4. That is, there are two VODs \nu_1=\zeta_{\mathbf{v}_1} and \nu_2=\zeta_{\mathbf{v}_2}, which are uniform on the sets \mathcal{Y}_1 and \mathcal{Y}_2 containing the points (-1,-1,-1)^{\prime}, (+1,+1,-1)^{\prime}, (+1,-1,+1)^{\prime}, (-1,+1,+1)^{\prime}, and (+1,+1,+1)^{\prime}, (-1,-1,+1)^{\prime}, (-1,+1,-1)^{\prime}, (+1,-1,-1)^{\prime}, respectively. The set \Upsilon_{*} consists of the designs \alpha\nu_1+(1-\alpha)\nu_2, where \alpha\in[0,1], which assign the weight \alpha/4 to all points of \mathcal{Y}_1 and the weight (1-\alpha)/4 to all points of \mathcal{Y}_2. One can easily see that the VODs \nu_1 and \nu_2 indeed satisfy all characterizations given by Theorem 4.

2.5 Rational optimal design problems

In Examples 3 to 5, we provided the complete characterization of the optimal design polytope via its vertices for Model (1). However, such analytical derivations become tedious with an increasing number of factors and more complex models.

Fortunately, the problems of finding all VODs for many standard models (including Model (1)) are “rational”, that is, the polytope of optimal weights has a rational H-representation (5). More precisely, it means that the matrices \mathbf{M}_{*} and \mathbf{f}_i\mathbf{f}^{\prime}_i (i=1,\ldots,d) have rational elements. These conditions are automatically satisfied if \mathbf{f}_1,\ldots,\mathbf{f}_d have rational elements, and the weights of at least one optimal design are rational numbers.

For a rational optimal design problem, all VODs \nu_1,\ldots,\nu_{\ell} have rational weights. Moreover, a rational optimal design problem allows for the use of error-free rational-arithmetic algorithms to construct the theoretical forms of all VODs. In Section 3, we therefore employ computer-assisted construction of VODs that are (analytically) optimal, rather than the manual construction of such designs.

The rational form of the weights of all VODs means that they are exact designs of some sizes N_1,\ldots,N_{\ell}. It is of theoretical interest that in the rational optimal design problems, any optimal approximate design is a convex combination of at most t+1 exact optimal designs; see Theorem 3. From the more practical point of view, the VODs can be combined to provide an entire pool of optimal designs that can be realized with a finite number of trials. In particular, for some N_0 we can achieve an exact design of any size N\geq N_0 that is a multiple of the greatest common divisor \mathrm{gcd}(N_1,\ldots,N_{\ell}), cf. Nijenhuis and Wilf [40]. We illustrate this idea for all models analyzed in Section 3.
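As a small numerical illustration of this remark (our own sketch, not from the paper), the following R function lists which total sizes N, up to a chosen bound, can be written as non-negative integer combinations of given VOD sizes.

achievable <- function(sizes, N_max) {
  ok <- c(TRUE, rep(FALSE, N_max))            # ok[N + 1] is TRUE iff size N is achievable
  for (N in seq_len(N_max)) {
    feas <- sizes[sizes <= N]
    ok[N + 1] <- length(feas) > 0 && any(ok[N - feas + 1])
  }
  which(ok) - 1                               # all achievable sizes from 0 to N_max
}
achievable(c(7, 21), 50)   # the two D-optimal VOD sizes for k = 6 in Section 3.1
# 0  7 14 21 28 35 42 49, i.e. the multiples of gcd(7, 21) = 7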

Finally, note that for a rational design problem, its rank s can also be calculated in a perfectly reliable way using rational arithmetic, even without the need to calculate the VODs. The problem rank alone provides the dimension of the set of all information matrices on \mathcal{Y} (Lemma 2), an upper bound on the minimal support size (Theorem 2) and the dimension t=d-s of the polytope \Upsilon_{*} of optimal designs (Lemma 2).

2.6 Specific optimal designs

Suppose that the optimal design is not unique, and we have the complete list of all VODs, characterizing the set Υ\Upsilon_{*} of all optimal designs. Then we can select one or more optimal designs with special properties, to broaden the choice for the experimenter.

2.6.1 Optimal designs with small supports

Due to Theorem 4, the VODs exactly coincide with the minimal optimal designs. Since the set of VODs is finite, it is straightforward to determine all absolutely minimal optimal designs, by selecting the VODs with the smallest possible size of the support.

Special optimal designs with small supports can also be obtained by maximizing a strictly convex function \Psi over \Upsilon_{*}. Such a \Psi can only attain its maximum at one or more of the extreme points of \Upsilon_{*}; by Theorem 4, these extreme points are exactly the VODs. For instance, since the negative entropy

\Psi_H(\zeta)=\sum_{\mathbf{y}\in\mathcal{Y}}\zeta(\mathbf{y})\log_2(\zeta(\mathbf{y}))

is strictly convex on \Upsilon_{*}, we are able to identify all minimum-entropy optimal designs by selecting the VODs with the smallest entropy.
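For instance, for the two VODs of Example 5, the following R sketch (illustrative) computes \Psi_H and the support sizes and selects the minimum-entropy optimal designs; here both VODs tie, so both are absolutely minimal.

v1 <- c(0, 1, 1, 0, 1, 0, 0, 1) / 4           # VOD weight vectors from Example 4
v2 <- c(1, 0, 0, 1, 0, 1, 1, 0) / 4
VODs <- list(v1 = v1, v2 = v2)
Psi_H <- function(w) sum(ifelse(w > 0, w * log2(w), 0))   # negative entropy
sapply(VODs, function(w) sum(w > 0))          # support sizes: 4 and 4
vals <- sapply(VODs, Psi_H)                   # both equal -2
names(VODs)[vals == max(vals)]                # maximizers of Psi_H, i.e. minimum-entropy VODs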

There are numerous potential advantages to using optimal designs with small supports, as opposed to those with large supports: they can better facilitate replicated observations, are simpler and more economical to execute, and offer a more compact representation.

Another advantage of approximate designs with small supports, such as the VODs, is that they are generally more appropriate for the “rounding” algorithms, which convert approximate designs to exact designs. For instance, the popular Efficient Rounding (see Pukelsheim and Rieder [45]) is only meaningful if the approximate design to be rounded has a support smaller than the required size N of the exact design. The size of the support also appears in the formula for the lower bound on the efficiency of the resulting exact design, in the sense that the smaller the support, the better the bound (Section 7 in Pukelsheim and Rieder [45]).

Moreover, it is possible to “round” an optimal approximate design by applying an advanced mathematical programming method (e.g., Sagnol and Harman [49] or Ahipasaoglu [1]) to the support of the optimal approximate design, which generally provides better exact designs than Efficient Rounding. This approach is only practically feasible if the associated optimization problem has a small number of variables, that is, if the support of the optimal approximate design is small.

2.6.2 Optimal designs with large supports

The list of all VODs also provides the complete characterization of all optimal designs whose support is the maximum possible, i.e., with support \mathcal{Y}. Recall that we call such designs maximal optimal designs. Every such design is of the form \alpha_1\nu_1+\cdots+\alpha_{\ell}\nu_{\ell}, where \alpha_j\geq 0, \sum_j\alpha_j=1,

\cup_{j:\alpha_j>0}\,\mathcal{S}(\nu_j)=\mathcal{Y},

and vice versa. For instance, we can construct a maximal optimal design with \alpha_j=1/\ell for all j=1,\ldots,\ell. In contrast to the minimally supported VODs, the set of all maximally-supported optimal designs corresponds to the relative interior of \Upsilon_{*}.

It is also often possible to select an optimal design with a large support by minimizing a convex function \Psi over \Upsilon_{*}. This can be performed by means of appropriate mathematical programming solvers applied to the problem

\min\Psi(\zeta_{\mathbf{w}}),\ \text{ s.t. }\mathbf{w}\in\mathbb{P}_{*}. (7)

For instance, using solvers for convex optimization on \Psi_H under the appropriate linear constraints, we can obtain the unique maximum-entropy optimal design.

While it is possible to solve (7) without the knowledge of the vertices of \mathbb{P}_{*}, the vertices allow us to use two alternative approaches. First, applying the representation \mathbb{T}_{*}, see (6), we can solve the convex optimization problem

\min\Psi(\zeta_{\mathbf{w}}),\ \mathbf{w}=\mathbf{U}\mathbf{t}+\mathbf{w}^{*},\ \text{ s.t. }\mathbf{U}\mathbf{t}\geq-\mathbf{w}^{*}. (8)

Here, the variable \mathbf{t} is t-dimensional. Second, we can use the fact that the vertices \mathbf{v}_1,\ldots,\mathbf{v}_{\ell} of \mathbb{P}_{*} generate all optimal weights; therefore, we can solve the convex optimization problem

\min\Psi(\zeta_{\mathbf{w}}),\ \mathbf{w}=\sum_{i=1}^{\ell}a_i\mathbf{v}_i,\ \text{ s.t. }\mathbf{a}\geq\mathbf{0}_{\ell},\ \mathbf{1}^{\prime}_{\ell}\mathbf{a}=1, (9)

where the variable \mathbf{a} is \ell-dimensional. For small t or small \ell, these two restatements can be much simpler problems than (7). For a demonstration, see Section 3.3.
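A small R sketch of approach (8) for Example 4 (our own illustration, using base R's constrOptim with the constraints \mathbf{U}\mathbf{t}\geq-\mathbf{w}^{*} and an analytic gradient of \Psi_H) recovers the maximum-entropy optimal design, which is the uniform design on \mathcal{Y}.

s1 <- c(0, 1, 1, 0, 1, 0, 0, 1); s2 <- 1 - s1
U  <- matrix((s1 - s2) / sqrt(8), ncol = 1)   # orthonormal basis of N(A), cf. Example 4
w_star <- rep(1/8, 8)                         # uniform maximal optimal design
Psi_H  <- function(w) sum(ifelse(w > 0, w * log2(w), 0))
obj  <- function(tt) Psi_H(drop(U %*% tt + w_star))
gobj <- function(tt) drop(crossprod(U, log2(drop(U %*% tt + w_star)) + 1/log(2)))
fit  <- constrOptim(theta = rep(0.2, ncol(U)), f = obj, grad = gobj,
                    ui = U, ci = -w_star, method = "BFGS")
round(drop(U %*% fit$par + w_star), 4)        # maximum-entropy optimal design: all weights 1/8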

Optimal designs with large supports can sometimes be preferable to those with small supports, because they offer a more comprehensive exploration of the design space and are more robust with respect to misspecifications of the statistical model. Additionally, they are sometimes mathematically simpler, exhibiting a higher degree of symmetry (cf. Section 5.1. in Harman et al. [27]).

2.6.3 Cost-efficient optimal designs

Finally, consider a linear cost function \Psi_c defined on \Upsilon_{*}:

\Psi_c(\zeta)=\sum_{\mathbf{y}\in\mathcal{Y}}c(\mathbf{y})\zeta(\mathbf{y}),

where c(\mathbf{y})\in\mathbb{R} is the cost associated with a trial at \mathbf{y}\in\mathcal{Y}. While a \Psi_c-minimizing optimal design can be computed by solving a problem of linear programming, the list of all VODs provides much more: a straightforward and complete characterization of the set of all optimal solutions. Indeed, let \nu_{j_1},\ldots,\nu_{j_L} be the VODs that minimize \Psi_c. Then the set of all convex combinations of these designs is the set of all \Psi_c-minimizing optimal designs. Geometrically, the set of \Psi_c-minimizing optimal designs is always a face of \Upsilon_{*} (of some dimension between 0 and t). For an illustration, see Section 3.2.

As a special case, we can directly obtain the range of the optimal weights at \mathbf{y}\in\mathcal{Y}, for instance as a means to explore the possible costs associated with the trials at \mathbf{y}. For \mathbf{y}, the optimal weights \zeta^{*}(\mathbf{y}), \zeta^{*}\in\Upsilon_{*}, range in the (possibly degenerate) interval

W(\mathbf{y})=\left[\min_{\nu\in\Upsilon_{*}}\nu(\mathbf{y}),\ \max_{\nu\in\Upsilon_{*}}\nu(\mathbf{y})\right],

where \nu\mapsto\nu(\mathbf{y}) is linear on \Upsilon_{*}. Note that the interval W(\mathbf{y}) is in fact a one-dimensional projection of \mathbb{P}_{*}. Analogously, we can compute and visualize projections of \mathbb{P}_{*} of dimensions greater than 1, which can provide complete information on the possible ranges of subvectors of optimal weights; see Section 3.4 for illustrations of such projections.
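Because \nu\mapsto\nu(\mathbf{y}) is linear on \Upsilon_{*}, each interval W(\mathbf{y}) is attained at vertices, so it can be read off the list of VODs directly; a short R sketch for Examples 4 and 5 (illustrative):

V <- rbind(v1 = c(0, 1, 1, 0, 1, 0, 0, 1) / 4,    # one VOD weight vector per row
           v2 = c(1, 0, 0, 1, 0, 1, 1, 0) / 4)
W <- apply(V, 2, range)                           # 2 x d matrix of min and max optimal weights
rownames(W) <- c("min", "max")
W             # at every point of Y the optimal weight ranges over [0, 1/4]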

3 Examples

In this section, we focus on optimal designs for several common regression models with k factors. In each of these models, the uniform design on the maximum optimal support \mathcal{Y} is a maximal optimal design (see Section 2.2). The size d of \mathcal{Y} grows superpolynomially with k, while the number m of parameters only grows polynomially. The standard Carathéodory theorem (cf. Theorem 2) implies that there always exists an optimal design with the support size of at most m(m+1)/2. Therefore, there exists some k^{*} such that the optimal design is not unique for all k\geq k^{*}.

Furthermore, each of the optimal design problems from this section is a rational problem (Section 2.5), as can be easily seen from the form of the model and the maximal optimal design. Therefore, we can use the rational arithmetic to identify the characteristics of the polytope of optimal designs, specifically its affine dimension t and, most importantly, the sets of all VODs, cf. Sections 2.3 and 2.4. (We provide a description of the VODs for k smaller than a limit determined by the capacity of our computational equipment. For some larger values of k, we were only able to compute the rank s of the problem, which provides the dimension t of the polytope of optimal designs by Lemma 2.)

For each model in this section, the minimum-entropy optimal designs (cf. Section 2.6.1) coincide with the absolutely minimal optimal designs, and the unique maximum-entropy optimal design (cf. Section 2.6.2) is the uniform design on \mathcal{Y}.

Our calculations were performed with the help of the statistical software R; in particular, the rational polytope vertices were obtained using the packages of Lucas et al. [37] and Geyer and Meeden [23]. We used a 64-bit Windows 11 system with an Intel Core i7-9750H processor operating at 2.60 GHz. The R codes and full detailed tables of all VODs from Section 3 are available upon request from the authors.

3.1 The first-degree model on [0,1]^k without a constant term

Consider the model

E(Y_{\mathbf{x}}) = \beta_1 x_1 + \cdots + \beta_k x_k, (10)
\mathbf{x} = (x_1,\ldots,x_k)^{\prime}\in\mathcal{X}=[0,1]^k.

The case of a single factor (k=1) is trivial and for two factors (k=2), all \Phi_p-optimal designs are unique; see Section 8.6 in Pukelsheim [44].

Let k\geq 3. For j=1,\ldots,k, let \mathcal{V}(j) be the set of all vectors in \{0,1\}^k with exactly j components equal to 1. The following claims are straightforward to verify via (2) and (4); cf. Pukelsheim [44], page 376: If k is odd, the maximum optimal support is \mathcal{Y}_D=\mathcal{Y}_A=\mathcal{V}(k/2+1/2) for both D- and A-optimality. If k is even, the maximum optimal support is \mathcal{Y}_D=\mathcal{V}(k/2)\cup\mathcal{V}(k/2+1) for D-optimality and \mathcal{Y}_A=\mathcal{V}(k/2) for A-optimality. For any k, the design uniform on \mathcal{Y}_D is maximal D-optimal, and the design uniform on \mathcal{Y}_A is maximal A-optimal.
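A brief R sketch (our own illustration) constructs the sets \mathcal{V}(j) and the maximum optimal supports \mathcal{Y}_D and \mathcal{Y}_A described above, here for k=6.

Vset <- function(k, j) {
  X <- as.matrix(expand.grid(rep(list(0:1), k)))  # all 0/1 vectors of length k
  X[rowSums(X) == j, , drop = FALSE]              # those with exactly j ones
}
k   <- 6
Y_D <- rbind(Vset(k, k/2), Vset(k, k/2 + 1))      # D-optimality, k even: V(k/2) union V(k/2+1)
Y_A <- Vset(k, k/2)                               # A-optimality, k even: V(k/2)
c(nrow(Y_D), nrow(Y_A))                           # d = 35 and d = 20 support points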

Consequently, it is possible to apply rational arithmetic calculations to prove that for k\leq 5 the uniform optimal design on \mathcal{Y}_D is the unique D-optimal design, and the uniform design on \mathcal{Y}_A is the unique A-optimal design.

For k=6 and the D-criterion, the dimension of \Upsilon_{*} is t=14, and there are \ell=150 VODs. To describe the set of all VODs, consider two designs to be isomorphic, if they are the same up to a relabeling of the factors. Then, the VODs can be split into two classes of isomorphic designs. The first class consists of the 30 VODs that are isomorphic to

\left[\begin{array}{ccccccc}0&0&0&1&1&1&1\\ 0&1&1&0&0&1&1\\ 1&0&1&0&1&0&1\\ 1&1&0&1&0&0&1\\ 1&1&0&0&1&1&0\\ 1&0&1&1&0&1&0\\ \hline \frac{1}{7}&\frac{1}{7}&\frac{1}{7}&\frac{1}{7}&\frac{1}{7}&\frac{1}{7}&\frac{1}{7}\end{array}\right]. (11)

All 30 VODs in this class can be obtained by permuting the (first 6) rows of the table. Note that permuting the columns corresponds to the same VOD. One of these VODs has been computed by integer quadratic programming as an exact D-optimal design of size 7K, K\in\mathbb{N}; see Table 1 in Filová and Harman [18]. For k=6, these 30 VODs are the absolutely minimal D-optimal designs.

The second class of VODs for D-optimality and k=6 consists of the 120 VODs isomorphic to the uniform design

\left[\begin{array}{ccccccccccc}0&0&0&0&0&0&0&0&0&1&1\\ 0&0&0&1&1&1&1&1&1&0&0\\ 1&1&1&0&0&0&1&1&1&0&0\\ 1&1&1&0&1&1&0&0&1&0&1\\ 0&1&1&1&1&1&0&1&0&1&0\\ 1&0&1&1&0&1&1&1&0&1&1\\ \hline \frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}\end{array}\right.
\left.\begin{array}{cccccccccc}1&1&1&1&1&1&1&1&1&1\\ 0&0&0&0&1&1&1&1&1&1\\ 0&1&1&1&0&0&0&1&1&1\\ 1&0&0&1&0&1&1&0&0&1\\ 1&0&1&1&1&0&0&0&1&0\\ 1&1&0&0&0&0&1&1&0&0\\ \hline \frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}&\frac{1}{21}\end{array}\right]. (12)

For k=6 and for the criterion of A-optimality, \Upsilon_{*} has dimension t=5, and there are \ell=12 VODs, all isomorphic to

\left[\begin{array}{cccccccccc}0&0&0&0&0&1&1&1&1&1\\ 0&0&1&1&1&0&0&0&1&1\\ 1&1&0&0&1&0&0&1&0&1\\ 1&1&0&1&0&0&1&0&1&0\\ 0&1&1&1&0&1&0&1&0&0\\ 1&0&1&0&1&1&1&0&0&0\\ \hline \frac{1}{10}&\frac{1}{10}&\frac{1}{10}&\frac{1}{10}&\frac{1}{10}&\frac{1}{10}&\frac{1}{10}&\frac{1}{10}&\frac{1}{10}&\frac{1}{10}\end{array}\right]. (13)

One of these VODs has been identified by integer quadratic programming as an A-optimal exact design of size 10K, K\in\mathbb{N}; see Table 2 in Filová and Harman [18]. For k=6, these 12 VODs are the absolutely minimal A-optimal designs. A 3-dimensional projection of \mathbb{P}_{*} and its 12 vertices are depicted in Figure 1.

Figure 1: A 3-dimensional projection of the 5-dimensional polytope \mathbb{P}_{*} of A-optimal weights for Model (10) with k=6 factors. The projections of the 12 vertices of \mathbb{P}_{*} are depicted by black dots.

For k=7, the sets of D- and A-optimal designs coincide. The dimension of \Upsilon_{*} is t=14 and it has \ell=150 vertices; their structure is similar to the case of D-optimality for k=6.

For k=8 and the criterion of D-optimality, it holds that the size of the maximum optimal support is d=126, the problem rank is s=36, and the dimension of \Upsilon_{*} is t=90. For k=8 and the criterion of A-optimality we have d=70, s=28 and t=42. We were not able to enumerate the individual VODs for this size.

Relations to block designs. For Model (10), a k\times b matrix of support points of a design on \mathcal{Y}_D or \mathcal{Y}_A can be viewed as the incidence matrix of a binary block design with k treatments and b blocks. In this view, the support points of the 30 VODs isomorphic to (11) are partially balanced designs (PBDs; see Wilson [54]) with r=4, \lambda=2, and b=7. Here, r is the common replication number, and \lambda the common concurrence. Similarly, the 120 VODs isomorphic to (12) are PBDs with r=12, \lambda=6, and b=21. In addition, the support points of (13) correspond to a balanced incomplete block design (which is a PBD with a constant block size) for 6 treatments, r=5, \lambda=2, b=10, and a common block size 3 (cf. Pukelsheim [44], p. 378, Cheng [8]).

3.2 The first-degree model on [1,+1]k[-1,+1]^{k} without a constant term

In this section we consider the model

\begin{array}{rcl}
E(Y_{\mathbf{x}}) &=& \beta_{1}x_{1}+\cdots+\beta_{k}x_{k},\\
\mathbf{x} &=& (x_{1},\ldots,x_{k})^{\prime}\in\mathcal{X}=[-1,+1]^{k},
\end{array} \qquad (14)

where k2k\geq 2 (the case k=1k=1 is trivial). For this model, it is simple to see that the design ξ\xi^{*} uniform on 𝒴={1,+1}k\mathcal{Y}=\{-1,+1\}^{k} satisfies 𝐌(ξ)=𝐈m\mathbf{M}(\xi^{*})=\mathbf{I}_{m}, and to use conditions (3) and (4) to verify that ξ\xi^{*} is a maximal optimal design with respect to all Φp\Phi_{p}, p(,0]p\in(-\infty,0]; that is, the polytope of optimal designs does not depend on the choice of Kiefer’s criterion. For no kk is the maximal optimal design the unique optimal design. Clearly, the optimal design problem is rational for any kk, and we can employ rational-arithmetic calculations to derive the following characterizations.
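
The information-matrix part of this claim is also easy to confirm numerically. The following R sketch (illustrative only; k = 5 is an arbitrary small choice) averages the rank-one matrices xx′ over the full factorial {−1,+1}^k and checks that the result is the identity matrix.

## Check M(xi*) = I_k for Model (14): f(x) = x and uniform weights on {-1,+1}^k.
k <- 5
Y <- as.matrix(expand.grid(rep(list(c(-1, 1)), k)))   # 2^k support points (rows)
M <- matrix(0, k, k)
for (i in seq_len(nrow(Y))) {
  M <- M + tcrossprod(Y[i, ]) / nrow(Y)               # add w_i * x x' with w_i = 2^{-k}
}
all.equal(M, diag(k), check.attributes = FALSE)        # TRUE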

For k=2k=2, the dimension of Υ\Upsilon_{*} is t=2t=2, and there are =4\ell=4 VODs, each uniform on a two-point support. If we denote the elements of 𝒴\mathcal{Y} as 𝐲1=(1,1),𝐲2=(1,+1),𝐲3=(+1,1)\mathbf{y}_{1}=(-1,-1)^{\prime},\mathbf{y}_{2}=(-1,+1)^{\prime},\mathbf{y}_{3}=(+1,-1)^{\prime}, and 𝐲4=(+1,+1)\mathbf{y}_{4}=(+1,+1)^{\prime}, the VODs are uniform designs on the pairs {𝐲1,𝐲2}\{\mathbf{y}_{1},\mathbf{y}_{2}\}, {𝐲1,𝐲3}\{\mathbf{y}_{1},\mathbf{y}_{3}\}, {𝐲2,𝐲4}\{\mathbf{y}_{2},\mathbf{y}_{4}\} and {𝐲3,𝐲4}\{\mathbf{y}_{3},\mathbf{y}_{4}\}. These designs are the absolutely minimal optimal designs. They can be combined to construct exact optimal designs of sizes N=2KN=2K, KK\in\mathbb{N}.

For Model (14), we consider two designs to be isomorphic if one can be obtained from the other by any sequence of operations involving relabeling the factors, reversing the signs of any subset of the factors, and reversing the signs of all factors for any subset of the support points. For a design matrix representing a VOD, these operations mean permuting the rows, multiplying any set of rows by 1-1, and multiplying the factor levels of any set of columns by 1-1.

For k=3k=3, the dimension of Υ\Upsilon_{*} is t=4t=4, and there are =16\ell=16 VODs, all isomorphic to

\left[\begin{array}{cccc}
+1&-1&-1&+1\\
+1&-1&+1&-1\\
+1&+1&-1&-1\\
\hline
\frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}\\
\end{array}\right].

Note that the matrix of the support points is a submatrix of a 4×44\times 4 Hadamard matrix. These VODs are the absolutely minimal optimal designs. They can be combined to construct a variety of exact optimal designs of sizes N=4KN=4K, KK\in\mathbb{N}.

For k=4k=4, the dimension of Υ\Upsilon_{*} is t=9t=9, and there are =32\ell=32 VODs, all isomorphic to

\left[\begin{array}{cccc}
+1&+1&+1&+1\\
+1&-1&-1&+1\\
+1&-1&+1&-1\\
+1&+1&-1&-1\\
\hline
\frac{1}{4}&\frac{1}{4}&\frac{1}{4}&\frac{1}{4}\\
\end{array}\right].

The support points form a 4×44\times 4 Hadamard matrix. In fact, the uniform design on the rows of any Hadamard matrix of order 44 is a VOD and the support vectors of any VOD form a Hadamard matrix of order 44. These 3232 VODs are the absolutely minimal optimal designs. They can be combined to construct exact optimal designs of sizes N=4KN=4K, KK\in\mathbb{N}.

For k=5k=5, the dimension of Υ\Upsilon_{*} is t=21t=21, and there are as many as =35328\ell=35328 VODs. Of these, 25602560 are uniform designs on an 88-point support (i.e., they are exact of size N=8N=8) and 3276832768 are nonuniform, supported on 1111 points (N=12N=12). They can be used to construct exact optimal designs of sizes N=4KN=4K for KK\in\mathbb{N}, K2K\geq 2.

For k=6k=6, the size of the maximum optimal support is d=64d=64, the problem rank is s=16s=16 and the dimension of Υ\Upsilon_{*} is t=48t=48; for this size of the problem, the enumeration of the VODs was prohibitive for our computer equipment.

Designs minimizing a linear cost. As mentioned in Section 2.6.3, the list of all VODs allows us to completely describe the class of linear-cost minimizing optimal designs. Consider Model (14) with k=5k=5 factors and assume that the cost of the trial in 𝐲{1,+1}5\mathbf{y}\in\{-1,+1\}^{5} is c=an+bn+c=an_{-}+bn_{+}, with any a<ba<b, where nn_{-} is the number of levels 1-1 and n+n_{+} is the number of levels +1+1 in 𝐲\mathbf{y}. It is a matter of simple enumeration to determine that the VOD

\left[\begin{array}{cccccccc}
+1&-1&-1&-1&+1&-1&-1&-1\\
-1&+1&-1&-1&-1&+1&-1&-1\\
-1&-1&+1&-1&-1&-1&+1&-1\\
-1&-1&-1&+1&-1&-1&-1&+1\\
-1&-1&-1&-1&+1&+1&+1&+1\\
\hline
\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}\\
\end{array}\right] \qquad (15)

minimizes this cost, as does any of the 44 VODs that can be obtained from (15) by relabeling the factors. However, the VOD

\left[\begin{array}{cccccc}
+1&-1&-1&-1&-1&-1\\
-1&+1&-1&-1&-1&+1\\
-1&-1&+1&-1&-1&+1\\
-1&-1&-1&+1&-1&-1\\
-1&-1&-1&-1&+1&-1\\
\hline
\frac{1}{6}&\frac{1}{12}&\frac{1}{12}&\frac{1}{12}&\frac{1}{12}&\frac{1}{12}\\
\end{array}\right.
\left.\begin{array}{ccccc}
-1&-1&-1&-1&-1\\
+1&-1&+1&-1&-1\\
-1&+1&-1&+1&-1\\
+1&+1&-1&-1&+1\\
-1&-1&+1&+1&+1\\
\hline
\frac{1}{12}&\frac{1}{12}&\frac{1}{12}&\frac{1}{12}&\frac{1}{12}\\
\end{array}\right] \qquad (16)

also minimizes this cost, as does any of the 44 VODs that can be obtained from (16) by relabeling the factors. That is, the set of all cost-minimizing optimal designs is the set of all convex combinations of these 1010 VODs. Note that they provide exact optimal designs of sizes N=4KN=4K for KK\in\mathbb{N}, K2K\geq 2.
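
Given the list of all VODs, the screening behind this conclusion reduces to evaluating the expected cost of each VOD and keeping the minimizers. The R sketch below shows the evaluation for the two designs (15) and (16); the values a = 1 and b = 2 are arbitrary illustrative choices with a < b, and the function name exp_cost is ours.

## Expected cost of a design: sum_i w_i * (a*n_-(y_i) + b*n_+(y_i)).
exp_cost <- function(X, w, a, b) {
  nplus  <- colSums(X == +1)    # number of +1 levels in each support point (column)
  nminus <- colSums(X == -1)    # number of -1 levels
  sum(w * (a * nminus + b * nplus))
}
X15 <- rbind(c(+1,-1,-1,-1,+1,-1,-1,-1),
             c(-1,+1,-1,-1,-1,+1,-1,-1),
             c(-1,-1,+1,-1,-1,-1,+1,-1),
             c(-1,-1,-1,+1,-1,-1,-1,+1),
             c(-1,-1,-1,-1,+1,+1,+1,+1))
X16 <- rbind(c(+1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1),
             c(-1,+1,-1,-1,-1,+1,+1,-1,+1,-1,-1),
             c(-1,-1,+1,-1,-1,+1,-1,+1,-1,+1,-1),
             c(-1,-1,-1,+1,-1,-1,+1,+1,-1,-1,+1),
             c(-1,-1,-1,-1,+1,-1,-1,-1,+1,+1,+1))
exp_cost(X15, rep(1/8, 8),           a = 1, b = 2)   # 6.5 = 3.5*a + 1.5*b
exp_cost(X16, c(1/6, rep(1/12, 10)), a = 1, b = 2)   # 6.5, the same minimal value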

In Sections 3.3, 3.4, 3.5, we consider two designs to be isomorphic if one can be obtained from the other by a sequence of relabeling the factors and reversing the signs of a subset of the factors.

3.3 The first-degree model on [1,+1]k[-1,+1]^{k}

Here we consider the model

\begin{array}{rcl}
E(Y_{\mathbf{x}}) &=& \beta_{1}+\beta_{2}x_{1}+\cdots+\beta_{k+1}x_{k},\\
\mathbf{x} &=& (x_{1},\ldots,x_{k})^{\prime}\in\mathcal{X}=[-1,+1]^{k},
\end{array} \qquad (17)

for k1k\geq 1. For this model, it can again be readily shown that the design ξ\xi^{*} uniform on 𝒴={1,+1}k\mathcal{Y}=\{-1,+1\}^{k} has the information matrix 𝐈m\mathbf{I}_{m}, and it is possible to use conditions (3) and (4) to verify that ξ\xi^{*} is a maximal optimal design with respect to all of Kiefer’s criteria. Additionally, the optimal design problem itself is clearly rational.

The uniform design on 𝒴\mathcal{Y} is the unique optimal design if k=1k=1 and k=2k=2. For k=3k=3, the dimension of Υ\Upsilon_{*} is t=1t=1, and there are =2\ell=2 VODs, as described in Examples 1-5 for α=0\alpha=0 and α=1\alpha=1. Both these VODs are exact designs of size N=4N=4, absolutely minimal optimal designs, and they can be combined to form exact optimal designs of sizes N=4KN=4K for any KK\in\mathbb{N}.

The situation is more complex for k=4k=4: the dimension of Υ\Upsilon_{*} is t=5t=5, and there are =26\ell=26 VODs, which can be split into 33 classes of isomorphic designs. The first class includes 88 uniform designs isomorphic to

\left[\begin{array}{cccccccc}
+1&+1&+1&+1&-1&-1&-1&-1\\
+1&-1&-1&+1&+1&-1&-1&+1\\
+1&-1&+1&-1&+1&-1&+1&-1\\
+1&+1&-1&-1&+1&+1&-1&-1\\
\hline
\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}\\
\end{array}\right].

This matrix of support points is formed by concatenating a normalized Hadamard matrix 𝐇4\mathbf{H}_{4} of order 44 and the matrix constructed from 𝐇4\mathbf{H}_{4} by reversing the signs of the first row. Its transpose forms an 8×48\times 4 two-level orthogonal array of strength 22.

The second class of VODs consists of two designs isomorphic to the uniform design

\left[\begin{array}{cccccccc}
+1&+1&+1&+1&-1&-1&-1&-1\\
+1&-1&-1&+1&-1&+1&+1&-1\\
+1&-1&+1&-1&-1&+1&-1&+1\\
+1&+1&-1&-1&-1&-1&+1&+1\\
\hline
\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}&\frac{1}{8}\\
\end{array}\right].

This support matrix can be constructed as [𝐇4,𝐇4][\mathbf{H}_{4},-\mathbf{H}_{4}]. Its transpose forms an 8×48\times 4 orthogonal array of strength 33. The VODs belonging to these two classes of uniform designs supported on 88 points are the absolutely minimal optimal designs.
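
Both orthogonal-array claims can be confirmed by a brute-force frequency check. The following R sketch builds the two 8-point supports from the Sylvester Hadamard matrix of order 4 (any normalized 𝐇4 yields isomorphic designs) and verifies that every pair, respectively every triple, of factors exhibits all level combinations equally often; the function name strength_at_least is ours.

## Two 8-point supports built from a normalized Hadamard matrix H4.
H2 <- matrix(c(1, 1, 1, -1), 2, 2)
H4 <- kronecker(H2, H2)                  # Sylvester Hadamard matrix of order 4
S1 <- cbind(H4, H4 * c(-1, 1, 1, 1))     # H4 next to H4 with the signs of the first row reversed
S2 <- cbind(H4, -H4)                     # [H4, -H4]

## A support matrix S (factors in rows, points in columns) has strength >= str if every
## choice of str rows shows all 2^str sign combinations equally often among the columns.
strength_at_least <- function(S, str) {
  combs <- combn(nrow(S), str)
  all(apply(combs, 2, function(rows) {
    counts <- table(apply(S[rows, , drop = FALSE], 2, paste, collapse = ","))
    length(counts) == 2^str && length(unique(counts)) == 1
  }))
}
strength_at_least(S1, 2)   # TRUE:  transpose of S1 is an orthogonal array of strength 2
strength_at_least(S1, 3)   # FALSE: it does not have strength 3
strength_at_least(S2, 3)   # TRUE:  transpose of S2 is an orthogonal array of strength 3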

The third class of isomorphic VODs for k=4k=4 contains 1616 designs, which are no longer uniform. They are all isomorphic to the following design supported on 1111 points:

\left[\begin{array}{cccccc}
-1&-1&-1&+1&-1&-1\\
-1&-1&+1&-1&-1&+1\\
-1&+1&-1&-1&+1&-1\\
+1&-1&-1&-1&+1&+1\\
\hline
\frac{1}{12}&\frac{1}{12}&\frac{1}{12}&\frac{1}{12}&\frac{1}{12}&\frac{1}{12}\\
\end{array}\right.
\left.\begin{array}{ccccc}
-1&+1&+1&+1&+1\\
+1&-1&-1&+1&+1\\
+1&-1&+1&-1&+1\\
-1&+1&-1&-1&+1\\
\hline
\frac{1}{12}&\frac{1}{12}&\frac{1}{12}&\frac{1}{12}&\frac{1}{6}\\
\end{array}\right].

If the point with weight 1/61/6 were to be replicated twice, then the support points would form the transpose of a 12×412\times 4 two-level orthogonal array of strength 22. The 2626 VODs for k=4k=4 provide exact designs of sizes N=4KN=4K, KK\in\mathbb{N}, K2K\geq 2.

For k=5k=5, the dimension of Υ\Upsilon_{*} is t=16t=16, and the number of VODs is =14110\ell=14110. The support sizes of the VODs are 88 (they are all exact designs of size N=8N=8), 1111 (N=12N=12), 1212 (N=12N=12), 1313 (N=20N=20), 1515 (N=24N=24), and 1616 (N=16,28,32,36N=16,28,32,36). The VODs can be combined to provide exact designs of sizes N=4KN=4K, KK\in\mathbb{N}, K2K\geq 2.

For k=6k=6, the size of the maximum optimal support is d=64d=64, the problem rank is s=22s=22 and the dimension of Υ\Upsilon_{*} is t=42t=42. The enumeration of the VODs for k=6k=6 exceeded the capabilities of our computer equipment.

Optimal designs minimizing a convex function. To illustrate the search for an optimal design that minimizes a convex function (cf. Section 2.6.2), consider Model (17) with k=4k=4 and the problem

\min\Psi\left(\sum_{i=1}^{d}r_{i}\zeta(\mathbf{y}_{i})\mathbf{f}_{i}\mathbf{f}^{\prime}_{i}\right),\text{ s.t. }\zeta\in\Upsilon_{*}, \qquad (18)

where r1,,rd(0,1]r_{1},\ldots,r_{d}\in(0,1] are given coefficients, and Ψ=Φ0\Psi=-\Phi_{0}, i.e., the negative criterion of DD-optimality. The problem (18) can be interpreted as optimizing Φ0\Phi_{0} on Υ\Upsilon_{*} in a heteroscedastic version of (17), with variances of the observations in 𝐲1,,𝐲d\mathbf{y}_{1},\ldots,\mathbf{y}_{d} proportional to 1/r1,,1/rd1/r_{1},\ldots,1/r_{d}. An alternative interpretation is optimizing Φ0\Phi_{0} on Υ\Upsilon_{*} in (17) when proportions 1r1,,1rd1-r_{1},\ldots,1-r_{d} of the trials in 𝐲1,,𝐲d\mathbf{y}_{1},\ldots,\mathbf{y}_{d} fail.

For illustration, we chose the coefficients ri=1r_{i}=1 for 𝐲i=(1,1,1,1)\mathbf{y}_{i}=(-1,-1,-1,-1)^{\prime}, ri=0.95r_{i}=0.95 for all four 𝐲i\mathbf{y}_{i} that contain one level +1+1, ri=0.85r_{i}=0.85 for all six 𝐲i\mathbf{y}_{i} that contain two levels +1+1, ri=0.70r_{i}=0.70 for all four 𝐲i\mathbf{y}_{i} that contain three levels +1+1 and ri=0.50r_{i}=0.50 for 𝐲i=(+1,+1,+1,+1)\mathbf{y}_{i}=(+1,+1,+1,+1)^{\prime}.

We formulated (18) in the form (8) based on 𝕋\mathbb{T}_{*}. Instead of this form, one can also use the forms (7) or (9), but our version works with 𝐭5\mathbf{t}\in\mathbb{R}^{5}, whereas (7) is based on 𝐰16\mathbf{w}\in\mathbb{R}^{16} and (9) is based on 𝐚26\mathbf{a}\in\mathbb{R}^{26}. Moreover, (8) requires maximizing a concave function over a feasible set with a guaranteed interior point 𝟎5𝕋\mathbf{0}_{5}\in\mathbb{T}_{*}. This allowed us to solve the problem with a function as simple as constrOptim in R; an illustrative sketch of such a computation is given after the displayed design below. This procedure provided the design

\left[\begin{array}{cccccc}
-1&+1&+1&-1&+1&-1\\
-1&+1&-1&+1&-1&+1\\
-1&-1&+1&+1&-1&-1\\
-1&-1&-1&-1&+1&+1\\
\hline
0.156&0.094&0.094&0.094&0.094&0.094\\
\end{array}\right.
\left.\begin{array}{cccccc}
-1&+1&+1&+1&-1&+1\\
-1&+1&+1&-1&+1&+1\\
+1&+1&-1&+1&+1&+1\\
+1&-1&+1&+1&+1&+1\\
\hline
0.094&0.062&0.062&0.062&0.062&0.032\\
\end{array}\right].
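
A minimal illustrative version of this computation is sketched below. It parametrizes the optimal polytope as w(t) = w̃* + Bt, with w̃* the uniform maximal optimal design and the columns of B a basis of the null space of 𝐀; this is one concrete realization of the reduced form (8), not the exact script we used, and the helper names vech, Fm and B are ours.

## Model (17) with k = 4: minimize Psi = -log det( sum_i r_i w_i f_i f_i' ) over the
## optimal polytope, parametrized as w(t) = w_max + B t with B spanning the null space of A.
library(MASS)                                          # for Null()
k <- 4
Y  <- as.matrix(expand.grid(rep(list(c(-1, 1)), k)))   # 16 support points (rows)
Fm <- cbind(1, Y)                                      # rows are the regression vectors f(y)'
vech <- function(M) M[lower.tri(M, diag = TRUE)]
A <- apply(Fm, 1, function(f) vech(tcrossprod(f)))     # q x d matrix with columns vech(f_i f_i')
B <- Null(t(A))                                        # d x t basis of the null space of A (t = 5)
w_max <- rep(1 / 16, 16)                               # uniform maximal optimal design
r <- c(1, 0.95, 0.85, 0.70, 0.50)[rowSums(Y == 1) + 1] # coefficients r_i from the example above

obj <- function(tt) {                                  # Psi evaluated at w(t)
  w <- w_max + as.vector(B %*% tt)
  M <- t(Fm) %*% (Fm * (r * w))                        # = sum_i r_i w_i f_i f_i'
  -as.numeric(determinant(M, logarithm = TRUE)$modulus)
}
## Constraints w(t) >= 0, i.e. B t >= -w_max; t = 0 is an interior point.
fit <- constrOptim(theta = rep(0, ncol(B)), f = obj, grad = NULL, ui = B, ci = -w_max)
round(w_max + as.vector(B %*% fit$par), 3)             # weights approximately matching the design
                                                       # above (points outside its support get ~0)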

3.4 The first-degree model on [1,+1]k[-1,+1]^{k} with second-order interactions

Consider the model

\begin{array}{rcl}
E(Y_{\mathbf{x}}) &=& \beta_{1}+\sum_{i=1}^{k}\beta_{i+1}x_{i}+\sum_{1\leq i<j\leq k}\beta_{i,j}x_{i}x_{j},\\
\mathbf{x} &=& (x_{1},\ldots,x_{k})^{\prime}\in\mathcal{X}=[-1,+1]^{k},
\end{array} \qquad (19)

where k2k\geq 2 is the number of factors, and m=1+k+(k2)m=1+k+\binom{k}{2} is the number of parameters. Model (19) is again a rational model, and the design ξ\xi^{*} uniform on 𝒴={1,+1}k\mathcal{Y}=\{-1,+1\}^{k} satisfying 𝐌(ξ)=𝐈m\mathbf{M}(\xi^{*})=\mathbf{I}_{m} is a maximal Φp\Phi_{p}-optimal design for any p(,0]p\in(-\infty,0], as can be confirmed by verifying conditions (3) and (4). It is the unique optimal design for k=2,3,4k=2,3,4.

However, for k=5k=5, an infinite number of optimal designs already exist; the dimension of Υ\Upsilon_{*} is t=1t=1, and there are =2\ell=2 VODs, mutually isomorphic. One of these two VODs is the uniform design

\left[\begin{array}{cccccccc}
-1&-1&-1&-1&-1&-1&-1&-1\\
-1&-1&-1&-1&+1&+1&+1&+1\\
-1&-1&+1&+1&-1&-1&+1&+1\\
-1&+1&-1&+1&-1&+1&-1&+1\\
+1&-1&-1&+1&-1&+1&+1&-1\\
\hline
\frac{1}{16}&\frac{1}{16}&\frac{1}{16}&\frac{1}{16}&\frac{1}{16}&\frac{1}{16}&\frac{1}{16}&\frac{1}{16}\\
\end{array}\right.
\left.\begin{array}{cccccccc}
+1&+1&+1&+1&+1&+1&+1&+1\\
-1&-1&-1&-1&+1&+1&+1&+1\\
-1&-1&+1&+1&-1&-1&+1&+1\\
-1&+1&-1&+1&-1&+1&-1&+1\\
-1&+1&+1&-1&+1&-1&-1&+1\\
\hline
\frac{1}{16}&\frac{1}{16}&\frac{1}{16}&\frac{1}{16}&\frac{1}{16}&\frac{1}{16}&\frac{1}{16}&\frac{1}{16}\\
\end{array}\right].

Its support points form the transpose of a 16×516\times 5 orthogonal array of strength 44. Both of these VODs are absolutely minimal optimal designs. They can be combined to obtain exact optimal designs of sizes N=16KN=16K, KK\in\mathbb{N}.

For k=6k=6, the dimension of Υ\Upsilon_{*} is t=7t=7 and there are =78\ell=78 VODs. The visualizations of selected 22-dimensional projections of the corresponding \mathbb{P}_{*} are in Figure 2. Of these 7878 VODs, 1414 are uniform on a 3232-point support (exact of sizes N=32N=32), and 6464 are nonuniform on a 5757-point support (N=80N=80). Combining these VODs yields exact optimal designs of sizes N=32N=32 and N=16KN=16K for all K4K\geq 4.

Figure 2: Projections of the polytope \mathbb{P}_{*} for Model (19) with k=6k=6 factors to two-dimensional planes determined by various pairs of coordinate axes. More precisely, the panels (a), (b), and (c) depict the pairs (ζ(𝐲i1),ζ(𝐲i2))(\zeta^{*}(\mathbf{y}_{i_{1}}),\zeta^{*}(\mathbf{y}_{i_{2}})) for optimal designs ζΥ\zeta^{*}\in\Upsilon_{*}, where i1,i2{1,,d}i_{1},i_{2}\in\{1,\ldots,d\} are such that 𝐲i1𝐲i2{2,0}\mathbf{y}^{\prime}_{i_{1}}\mathbf{y}_{i_{2}}\in\{-2,0\}, 𝐲i1𝐲i2{4,2}\mathbf{y}^{\prime}_{i_{1}}\mathbf{y}_{i_{2}}\in\{-4,2\} and 𝐲i1𝐲i2{6,4}\mathbf{y}^{\prime}_{i_{1}}\mathbf{y}_{i_{2}}\in\{-6,4\}, respectively. The projections of the 7878 vertices of \mathbb{P}_{*} are depicted by black dots. Note that the projections of groups of vertices coincide: the number near a dot signifies the number of vertices whose projections overlap.

The computations were prohibitive for Model (19) with k7k\geq 7.

3.5 The additive second-degree model on [1,+1]k[-1,+1]^{k}

As the last example, we consider the model

\begin{array}{rcl}
E(Y_{\mathbf{x}}) &=& \beta_{1}+\sum_{i=1}^{k}\beta_{i+1}x_{i}+\sum_{i=1}^{k}\beta_{i+1+k}x_{i}^{2},\\
\mathbf{x} &=& (x_{1},\ldots,x_{k})^{\prime}\in\mathcal{X}=[-1,+1]^{k},
\end{array} \qquad (20)

where k1k\geq 1, and the number of parameters is m=2k+1m=2k+1. Model (20) is rational for DD-optimality, as it is possible to verify that the uniform design on 𝒴={1,0,1}k\mathcal{Y}=\{-1,0,1\}^{k} is a maximal DD-optimal design; cf. Pukelsheim [44] (Sections 9.5 and 9.9) for k=1k=1, and Schwabe and Wierich [50] for k>1k>1. (For this model, we consider only the criterion of DD-optimality, since analytic formulas of other Φp\Phi_{p}-optimal designs for Model (20) with k>1k>1 do not seem to be available in the literature, nor trivial to derive.) For k=1k=1 and k=2k=2, the uniform design on 𝒴\mathcal{Y} is the unique DD-optimal design.
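
The maximality claim can also be checked numerically through the equivalence theorem for DD-optimality: the variance function f′(y)M_*^{-1}f(y) must equal m = 2k+1 at every point of 𝒴 and must not exceed this value elsewhere on the cube. A short R sketch of this check for k = 3 (illustrative only):

## Equivalence-theorem check for Model (20), k = 3: for the uniform design on {-1,0,1}^3
## the variance function should equal m = 7 at all 27 grid points.
k <- 3
Y <- as.matrix(expand.grid(rep(list(c(-1, 0, 1)), k)))   # 27 grid points (rows)
f <- function(x) c(1, x, x^2)                            # regression function of Model (20)
Fm <- t(apply(Y, 1, f))                                  # 27 x 7 model matrix
M <- crossprod(Fm) / nrow(Y)                             # information matrix of the uniform design
d_fun <- function(x) { fx <- f(x); drop(t(fx) %*% solve(M, fx)) }
range(apply(Y, 1, d_fun))                                # both values equal 7 (up to rounding error)
d_fun(c(0.3, -0.7, 0.5))                                 # a non-grid point: value strictly below 7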

However, for k=3k=3 the dimension of the polytope of optimal designs is already t=8t=8 and there are =66\ell=66 VODs. First, there are 1212 VODs, uniform on 99 points, which form two distinct isomorphism classes. A representative of the 44 members of the first class of VODs is the design

\left[\begin{array}{rrrrrrrrr}
-1&+1&0&+1&0&-1&0&-1&+1\\
-1&0&+1&-1&0&+1&-1&0&+1\\
-1&-1&-1&0&0&0&+1&+1&+1\\
\hline
\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}\\
\end{array}\right]. \qquad (21)

A representative of the 88 members of the second class of VODs is the design

\left[\begin{array}{rrrrrrrrr}
-1&+1&0&0&-1&+1&+1&0&-1\\
-1&0&+1&-1&0&+1&-1&0&+1\\
-1&-1&-1&0&0&0&+1&+1&+1\\
\hline
\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}&\frac{1}{9}\\
\end{array}\right]. \qquad (22)

The support points of these 1212 VODs form the transpose of an orthogonal array of strength 22 with v=3v=3 levels and k=3k=3 factors. The nice geometric properties of these designs are apparent from Figure 3. These VODs are absolutely minimal optimal designs.

Figure 3: Vertex DD-optimal designs for Model (20) with k=3k=3 factors. The panel (a) displays a representative of the class of 44 isomorphic VODs uniform on a 99-point support. More precisely, the gray circles depict the support points of the VOD (21). Similarly, the panel (b) displays the representative (22) of the class of 88 isomorphic VODs uniform on a 99-point support, and the panel (c) displays the design (23) which is one of the remaining 5454 non-uniform VODs supported on 1717 points. The maximum optimal support {1,0,+1}3\{-1,0,+1\}^{3} is visualized by black dots, and the edges of the cube [1,+1]3[-1,+1]^{3} are visualized by dashed lines.

The remaining 5454 VODs are all designs supported on 1717 design points from 𝒴\mathcal{Y}, such that 1616 of these points have weight 1/181/18, one point has weight 1/91/9, and the projections to all three two-dimensional sub-spaces of these designs are uniform on the 3×33\times 3 grid. They form 55 distinct isomorphism classes which we will not detail. One of these VODs is

\left[\begin{array}{rrrrrrrrr}
-1&-1&-1&-1&-1&-1&0&0&0\\
-1&-1&0&0&+1&+1&-1&-1&0\\
-1&0&-1&+1&0&+1&-1&+1&0\\
\hline
\frac{1}{18}&\frac{1}{18}&\frac{1}{18}&\frac{1}{18}&\frac{1}{18}&\frac{1}{18}&\frac{1}{18}&\frac{1}{18}&\frac{1}{9}\\
\end{array}\right.
\left.\begin{array}{rrrrrrrr}
0&0&+1&+1&+1&+1&+1&+1\\
+1&+1&-1&-1&0&0&+1&+1\\
-1&+1&0&+1&-1&+1&-1&0\\
\hline
\frac{1}{18}&\frac{1}{18}&\frac{1}{18}&\frac{1}{18}&\frac{1}{18}&\frac{1}{18}&\frac{1}{18}&\frac{1}{18}\\
\end{array}\right]. \qquad (23)

If the trial at (0,0,0)(0,0,0)^{\prime} were replicated twice, the matrix of the support points would form the transpose of an 18×318\times 3 orthogonal array with 33 levels of strength 22. The 6666 VODs for k=3k=3 can be combined to obtain various exact optimal designs of sizes N=9KN=9K, KK\in\mathbb{N}.

As noted by one reviewer, Model (20) on 𝒴\mathcal{Y} can be linearly reparametrized as a main-effects ANOVA model with kk factors on 33 levels each. Consequently, the DD-optimal designs in these two models should coincide. Indeed, it is possible to verify that the 3-dimensional projections of the widely used Taguchi L18 design for the ANOVA model form (some) DD-optimal designs for Model (20) with k=3k=3 factors. These DD-optimal designs can be obtained as 12νa+12νb\frac{1}{2}\nu_{a}+\frac{1}{2}\nu_{b}, where νa\nu_{a} and νb\nu_{b} are VODs from the two isomorphism classes represented by designs (21) and (22).

For k=4k=4, the size of the maximum optimal support is d=81d=81, the problem rank is s=33s=33, and the dimension of the polytope of optimal designs is t=48t=48. However, for k=4k=4 the enumeration of the VODs was too time-consuming.

4 Conclusions

We have drawn attention to the fact that the set of all optimal approximate designs is generally a polytope. We have shown that its characteristics can be beneficial to both the theory and applications of experimental design. Particularly noteworthy are the vertices of this polytope – the vertex optimal designs – because of several unique properties, especially their small supports, and because of their potential to generate all optimal designs.

Another key point is the application of a rational-arithmetic method that accurately identifies all vertex optimal designs for rational optimal design problems. The use of rational vertex enumeration methods is a new tool for constructing optimal approximate as well as optimal exact designs in an error-free manner, complementing the plethora of analytical techniques in the literature.

Importantly, we have obtained the vertex optimal designs for several commonly used models, thus providing complete information on the optimal approximate designs in these cases. We have demonstrated that in the studied cases, these designs, along with their combinations, correspond to exact optimal designs of various sizes, often related to block designs or orthogonal arrays.

To the best of our knowledge, this paper presents the first systematic approach to studying the set of all optimal designs, and it motivates a variety of intriguing topics for further research. For instance, it is possible to investigate the polytope of optimal designs for important non-rational design problems (including the optimal design problems for the full quadratic model with many factors) and less common statistical models, such as models for paired comparisons (e.g., [25]) or models with restricted design spaces (e.g., [19]). One can also explore extending the results to more specialized optimality criteria, such as those focusing on the estimation of a subset of parameters (see, for instance, Sections 10.2 to 10.5 in Atkinson et al. [2]). In addition, a general avenue of future research is the application of rational-arithmetic calculations in other problems of experimental design.


Acknowledgements. We are grateful to Professor Rainer Schwabe from Otto von Guericke University Magdeburg and Associate Professor Martin Máčaj from Comenius University Bratislava for helpful discussions on some aspects of the paper. We would also like to thank the anonymous referees for their insightful comments that helped improve the paper.

Declarations

  • Funding. This work was supported by the Slovak Scientific Grant Agency (VEGA), Grant 1/0362/22.

  • Competing interests. The authors have no relevant financial or non-financial interests to disclose.

  • Ethics approval and consent to participate. Not applicable.

  • Consent for publication. Not applicable.

  • Data availability. Not applicable.

  • Materials availability. Not applicable.

  • Code availability. All codes used for the numerical results are available upon request from the authors.

  • Author contribution. All coauthors contributed to the conceptualization, methodology, mathematical proofs, software, writing and reviewing of the article.

Appendix A Notation

Below is an overview of the notation used throughout the paper and the appendices.

  • 𝟎d\mathbf{0}_{d}\ldots d×1d\times 1 vector of all zeros

  • 𝟏d\mathbf{1}_{d}\ldots d×1d\times 1 vector of all ones

  • a1,,ada_{1},\ldots,a_{d}\ldots components of a vector 𝐚d\mathbf{a}\in\mathbb{R}^{d}

  • 𝐈m\mathbf{I}_{m}\ldots m×mm\times m identity matrix

  • 𝒞(𝐀)\mathcal{C}(\mathbf{A})\ldots column space of the matrix 𝐀\mathbf{A}

  • 𝒩(𝐀)\mathcal{N}(\mathbf{A})\ldots null space of the matrix 𝐀\mathbf{A}

  • tr(𝐀)\mathrm{tr}(\mathbf{A})\ldots trace of the matrix 𝐀\mathbf{A}

  • det(𝐀)\mathrm{det}(\mathbf{A})\ldots determinant of the matrix 𝐀\mathbf{A}

  • vech(𝐀)\mathrm{vech}(\mathbf{A})\ldots half-vectorization of the matrix 𝐀

  • 𝐚,𝐀\mathbf{a}^{\prime},\mathbf{A}^{\prime}\ldots transpose of 𝐚,𝐀\mathbf{a},\mathbf{A}, respectively

  • Yx=𝐟(𝐱)β+ϵY_{x}=\mathbf{f}^{\prime}(\mathbf{x})\beta+\epsilon\ldots linear model

  • 𝐟m\mathbf{f}\in\mathbb{R}^{m}\ldots regression function

  • mm\ldots number of model parameters in β\beta

  • NN\ldots number of trials

  • 𝐱\mathbf{x}\ldots design point

  • 𝒳\mathcal{X}\ldots set of all design points, design space

  • ξ\xi\ldots (approximate) design

  • 𝒮(ξ)\mathcal{S}(\xi)\ldots support of ξ\xi

  • 𝐌(ξ)\mathbf{M}(\xi)\ldots information matrix of ξ\xi

  • Φp\Phi_{p}\ldots Kiefer’s Φp\Phi_{p}-criterion

  • ξ,ζ\xi^{*},\zeta^{*}\ldots optimal design

  • 𝐌\mathbf{M}_{*}\ldots optimal information matrix

  • 𝒴\mathcal{Y}\ldots maximum optimal support

  • dd\ldots size of 𝒴\mathcal{Y}

  • 𝐲1,,𝐲d\mathbf{y}_{1},\ldots,\mathbf{y}_{d}\ldots elements of 𝒴\mathcal{Y}

  • 𝐟i=𝐟(𝐲i)\mathbf{f}_{i}=\mathbf{f}(\mathbf{y}_{i})

  • ζ\zeta\ldots design supported on 𝒴\mathcal{Y}

  • Υ\Upsilon\ldots set of all designs supported on 𝒴\mathcal{Y}

  • Υ\Upsilon_{*}\ldots set of all optimal designs

  • 𝐰d\mathbf{w}\in\mathbb{R}^{d}\ldots vector of design weights

  • 𝒮(𝐰)\mathcal{S}(\mathbf{w})\ldots support of 𝐰\mathbf{w}

  • \mathbb{P}\ldots simplex of the weights 𝐰\mathbf{w} of all designs on 𝒴\mathcal{Y}

  • \mathbb{P}_{*}\ldots polytope of all optimal weights

  • 𝐰ζ\mathbf{w}_{\zeta}\ldots vector of weights corresponding to ζ\zeta

  • ζ𝐰\zeta_{\mathbf{w}}\ldots design on 𝒴\mathcal{Y} with design weights 𝐰\mathbf{w}

  • 𝐀=[vech(𝐟1𝐟1),,vech(𝐟d𝐟d)]\mathbf{A}=\bigl{[}\mathrm{vech}\left(\mathbf{f}_{1}\mathbf{f}^{\prime}_{1}\right),\ldots,\mathrm{vech}\left(\mathbf{f}_{d}\mathbf{f}^{\prime}_{d}\right)\bigr{]}

  • q=m(m+1)/2q=m(m+1)/2 (number of rows of 𝐀\mathbf{A})

  • s=rank(𝐀)s=\mathrm{rank}(\mathbf{A}) (problem rank)

  • t=dst=d-s (dimension of \mathbb{P}_{*})

  • 𝕋\mathbb{T}_{*}\ldots polytope in t\mathbb{R}^{t} that is isomorphic to \mathbb{P}_{*}

  • \ell\ldots number of vertices of \mathbb{P}_{*}, number of VODs

  • 𝐯1,,𝐯\mathbf{v}_{1},\ldots,\mathbf{v}_{\ell}\ldots vertices of \mathbb{P}_{*}

  • ν1,,ν\nu_{1},\ldots,\nu_{\ell}\ldots vertex optimal designs (VODs)

  • kk\ldots number of factors

  • 𝒱(j)\mathcal{V}(j)\ldots vectors in {0,1}k\{0,1\}^{k} with jj ones

  • 𝒴D\mathcal{Y}_{D}\ldots set 𝒴\mathcal{Y} for DD-optimality

  • 𝒴A\mathcal{Y}_{A}\ldots set 𝒴\mathcal{Y} for AA-optimality

  • r,λ,br,\lambda,b\ldots replication number, concurrence number, number of blocks of a pairwise balanced design

  • 𝐇m\mathbf{H}_{m}\ldots Hadamard matrix of order mm

Appendix B Proofs

B.1 Proof of Lemma 1.

Proof.

Let 𝐰𝟎d\mathbf{w}\geq\mathbf{0}_{d} satisfy 𝐌(ζ𝐰)=𝐌\mathbf{M}(\zeta_{\mathbf{w}})=\mathbf{M}_{*}. Then tr(𝐌p)=tr(𝐌(ζ𝐰)𝐌p1)=i=1dwi𝐟i𝐌p1𝐟i=tr(𝐌p)(𝟏d𝐰)\mathrm{tr}(\mathbf{M}_{*}^{p})=\mathrm{tr}(\mathbf{M}(\zeta_{\mathbf{w}})\mathbf{M}_{*}^{p-1})=\sum_{i=1}^{d}w_{i}\mathbf{f}^{\prime}_{i}\mathbf{M}_{*}^{p-1}\mathbf{f}_{i}=\mathrm{tr}(\mathbf{M}_{*}^{p})(\mathbf{1}^{\prime}_{d}\mathbf{w}), where the second equality follows from the basic properties of the trace, and the last equality follows from (2), i.e., from the equivalence theorem 𝐟(𝐱)𝐌p1𝐟(𝐱)=tr(𝐌p)\mathbf{f}^{\prime}(\mathbf{x})\mathbf{M}^{p-1}_{*}\mathbf{f}(\mathbf{x})=\mathrm{tr}(\mathbf{M}^{p}_{*}) for all 𝐱𝒳\mathbf{x}\in\mathcal{X}. As tr(𝐌p)>0\mathrm{tr}(\mathbf{M}_{*}^{p})>0, we see that 𝟏d𝐰=1\mathbf{1}^{\prime}_{d}\mathbf{w}=1, i.e., 𝐰\mathbf{w}\in\mathbb{P}. ∎

B.2 Proof of Lemma 2.

Proof.

(a): Because 𝐀\mathbf{A} is a q×dq\times d matrix, the second inequality is clear. To prove the first inequality, observe that since 𝐌\mathbf{M}_{*} is nonsingular, 𝐟1,,𝐟d\mathbf{f}_{1},\ldots,\mathbf{f}_{d} span m\mathbb{R}^{m}; i.e., there exist mm linearly independent 𝐟i\mathbf{f}_{i}’s. Without loss of generality assume that 𝐟1,,𝐟m\mathbf{f}_{1},\ldots,\mathbf{f}_{m} are linearly independent. We shall prove that the matrices 𝐟1𝐟1,,𝐟m𝐟m\mathbf{f}_{1}\mathbf{f}_{1}^{\prime},\ldots,\mathbf{f}_{m}\mathbf{f}_{m}^{\prime} are then also linearly independent (in the space of all m×mm\times m matrices):

Let iαi𝐟i𝐟i=𝟎\sum_{i}\alpha_{i}\mathbf{f}_{i}\mathbf{f}_{i}^{\prime}=\mathbf{0} for some α1,,αm\alpha_{1},\ldots,\alpha_{m}. This can be expressed as 𝐁𝐃𝐁=𝟎\mathbf{B}\mathbf{D}\mathbf{B}^{\prime}=\mathbf{0}, where 𝐁=[𝐟1,,𝐟m]\mathbf{B}=[\mathbf{f}_{1},\ldots,\mathbf{f}_{m}], and 𝐃\mathbf{D} is the diagonal matrix with α1,,αm\alpha_{1},\ldots,\alpha_{m} on the diagonal.

Because 𝐟1,,𝐟m\mathbf{f}_{1},\ldots,\mathbf{f}_{m} are linearly independent, 𝐁\mathbf{B} is an m×mm\times m non-singular matrix. Therefore rank(𝐃)=rank(𝟎)=0\mathrm{rank}(\mathbf{D})=\mathrm{rank}(\mathbf{0})=0, where we used the fact that the multiplication by a non-singular matrix preserves rank. It follows that 𝐃=𝟎\mathbf{D}=\mathbf{0}; i.e., α1==αm=0\alpha_{1}=\ldots=\alpha_{m}=0, which proves the linear independence of 𝐟1𝐟1,𝐟m𝐟m\mathbf{f}_{1}\mathbf{f}_{1}^{\prime},\ldots\mathbf{f}_{m}\mathbf{f}_{m}^{\prime}.

Since vech\mathrm{vech} is a linear isomorphism, it follows that vech(𝐟1𝐟1),,vech(𝐟m𝐟m)\mathrm{vech}(\mathbf{f}_{1}\mathbf{f}_{1}^{\prime}),\ldots,\mathrm{vech}(\mathbf{f}_{m}\mathbf{f}_{m}^{\prime}) are linearly independent, which implies rank(𝐀)m\mathrm{rank}(\mathbf{A})\geq m.

(b): First note that (b) trivially holds when d=1d=1: then s=1s=1 and the set {𝐌(𝐰):𝐰}\{\mathbf{M}(\mathbf{w}):\mathbf{w}\in\mathbb{P}\} contains just one matrix, so its dimension is 0=s10=s-1. Suppose now that d2d\geq 2. Let 𝐠i:=vech(𝐟i𝐟i)\mathbf{g}_{i}:=\mathrm{vech}(\mathbf{f}_{i}\mathbf{f}_{i}^{\prime}), i=1,,di=1,\ldots,d. The vech\mathrm{vech} function is a linear isomorphism between 𝕊m\mathbb{S}^{m} and q\mathbb{R}^{q}, so the dimension of {𝐌(𝐰):𝐰}\{\mathbf{M}(\mathbf{w}):\mathbf{w}\in\mathbb{P}\} is equal to the dimension of {vech(𝐌(𝐰)):𝐰}\{\mathrm{vech}(\mathbf{M}(\mathbf{w})):\mathbf{w}\in\mathbb{P}\}. The latter set is exactly the convex hull of 𝐠1,,𝐠d\mathbf{g}_{1},\ldots,\mathbf{g}_{d}, whose dimension is equal to the dimension of the affine hull of 𝐠1,,𝐠d\mathbf{g}_{1},\ldots,\mathbf{g}_{d}; i.e., the dimension of 𝕄:={𝐀𝐰:𝟏d𝐰=1}\mathbb{M}:=\{\mathbf{A}\mathbf{w}:\mathbf{1}^{\prime}_{d}\mathbf{w}=1\}. Observe that the linearity of vech gives 𝐀𝐰=vech(iwi𝐟i𝐟i)\mathbf{A}\mathbf{w}=\mathrm{vech}(\sum_{i}w_{i}\mathbf{f}_{i}\mathbf{f}_{i}^{\prime}) and 𝐀𝟏d=vech(i𝐟i𝐟i)\mathbf{A}\mathbf{1}_{d}=\mathrm{vech}(\sum_{i}\mathbf{f}_{i}\mathbf{f}_{i}^{\prime}).

We will show that i𝐟i𝐟iiwi𝐟i𝐟i\sum_{i}\mathbf{f}_{i}\mathbf{f}_{i}^{\prime}\neq\sum_{i}w_{i}\mathbf{f}_{i}\mathbf{f}_{i}^{\prime} for any vector 𝐰d\mathbf{w}\in\mathbb{R}^{d} satisfying 𝟏d𝐰=1\mathbf{1}_{d}^{\prime}\mathbf{w}=1 (even for 𝐰\mathbf{w} with some negative components). If we denote 𝐇:=𝐌p1\mathbf{H}:=\mathbf{M}_{*}^{p-1} and c:=tr(𝐌pc:=\mathrm{tr}(\mathbf{M}_{*}^{p}), the fact that 𝐟(𝐱)𝐌p1(ξ~)𝐟(𝐱)=tr(𝐌p(ξ~))\mathbf{f}^{\prime}(\mathbf{x})\mathbf{M}^{p-1}(\tilde{\xi}^{*})\mathbf{f}(\mathbf{x})=\mathrm{tr}(\mathbf{M}^{p}(\tilde{\xi}^{*})) for all 𝐱𝒮(ξ~)\mathbf{x}\in\mathcal{S}(\tilde{\xi}^{*}) implies 𝐟i𝐇𝐟i=c\mathbf{f}_{i}^{\prime}\mathbf{H}\mathbf{f}_{i}=c for all ii. For contradiction, suppose that there exists 𝐰\mathbf{w} such that 𝟏d𝐰=1\mathbf{1}_{d}^{\prime}\mathbf{w}=1 and i𝐟i𝐟i=iwi𝐟i𝐟i\sum_{i}\mathbf{f}_{i}\mathbf{f}_{i}^{\prime}=\sum_{i}w_{i}\mathbf{f}_{i}\mathbf{f}_{i}^{\prime}. It follows that tr(𝐇i𝐟i𝐟i)=tr(𝐇iwi𝐟i𝐟i)\mathrm{tr}(\mathbf{H}\sum_{i}\mathbf{f}_{i}\mathbf{f}_{i}^{\prime})=\mathrm{tr}(\mathbf{H}\sum_{i}w_{i}\mathbf{f}_{i}\mathbf{f}_{i}^{\prime}). Properties of tr\mathrm{tr} then yield i𝐟i𝐇𝐟i=iwi𝐟i𝐇𝐟i\sum_{i}\mathbf{f}_{i}^{\prime}\mathbf{H}\mathbf{f}_{i}=\sum_{i}w_{i}\mathbf{f}_{i}^{\prime}\mathbf{H}\mathbf{f}_{i}; i.e., cd=ciwi=ccd=c\sum_{i}w_{i}=c. This is, however, a contradiction, because d2d\geq 2.

As i𝐟i𝐟iiwi𝐟i𝐟i\sum_{i}\mathbf{f}_{i}\mathbf{f}_{i}^{\prime}\neq\sum_{i}w_{i}\mathbf{f}_{i}\mathbf{f}_{i}^{\prime} for any 𝐰\mathbf{w} satisfying 𝟏d𝐰=1\mathbf{1}_{d}^{\prime}\mathbf{w}=1, we have 𝐀𝟏d𝐀𝐰\mathbf{A}\mathbf{1}_{d}\neq\mathbf{A}\mathbf{w} for any such 𝐰\mathbf{w}. It follows that 𝕄\mathbb{M} is a strict subset of {𝐀𝐰:𝐰d}\{\mathbf{A}\mathbf{w}:\mathbf{w}\in\mathbb{R}^{d}\}, which is 𝒞(𝐀)\mathcal{C}(\mathbf{A}). Therefore, the dimension of 𝕄\mathbb{M} must be less than the dimension of the column space 𝒞(𝐀)\mathcal{C}(\mathbf{A}), which is ss. However, we can also write 𝒞(𝐀)\mathcal{C}(\mathbf{A}) as {𝐀(𝐰+γ𝟏d):𝟏d𝐰=1,γ}=𝕄+{γ𝐀𝟏d:γ}\{\mathbf{A}(\mathbf{w}+\gamma\mathbf{1}_{d}):\mathbf{1}_{d}^{\prime}\mathbf{w}=1,\gamma\in\mathbb{R}\}=\mathbb{M}+\{\gamma\mathbf{A}\mathbf{1}_{d}:\gamma\in\mathbb{R}\} (in the sense of the usual definition of the sum of two sets), which means that the dimension of 𝕄\mathbb{M} is at least s1s-1. Thus, the dimension of 𝕄\mathbb{M} is exactly s1s-1.


(c): Since the vector of weights 𝐰~\widetilde{\mathbf{w}}^{*} that corresponds to the maximal optimal design satisfies 𝐰~>𝟎d\widetilde{\mathbf{w}}^{*}>\mathbf{0}_{d} and 𝐀𝐰~=vech(𝐌)\mathbf{A}\widetilde{\mathbf{w}}^{*}=\mathrm{vech}(\mathbf{M}_{*}), the dimension of \mathbb{P}_{*} is the dimension tt of the null-space 𝒩(𝐀)d\mathcal{N}(\mathbf{A})\subseteq\mathbb{R}^{d}. From the rank-nullity theorem we therefore have that the dimension of \mathbb{P}_{*} is drank(𝐀)=dsd-\mathrm{rank}(\mathbf{A})=d-s. ∎

B.3 Proof of Theorem 1.

Let 𝐛=vech(𝐌)\mathbf{b}=\mathrm{vech}(\mathbf{M}_{*}). Then, \mathbb{P}_{*} can be expressed as

\mathbb{P}_{*}=\{\mathbf{w}\in\mathbb{R}^{d}:\mathbf{A}\mathbf{w}=\mathbf{b},\ \mathbf{w}\geq\mathbf{0}_{d}\}.

Now some notation used in the theory of linear programming needs to be introduced:

  • A constraint is any row in 𝐀𝐰=𝐛\mathbf{A}\mathbf{w}=\mathbf{b} and any inequality wi0w_{i}\geq 0; i.e., \mathbb{P}_{*} is defined by q+dq+d constraints, where q=m(m+1)/2q=m(m+1)/2 is the number of rows of 𝐀\mathbf{A}.

  • A constraint is active at 𝐯d\mathbf{v}\in\mathbb{R}^{d} if it is satisfied as equality (that is, for 𝐯\mathbf{v}\in\mathbb{P}_{*} the active constraints are all the constraints in 𝐀𝐰=𝐛\mathbf{A}\mathbf{w}=\mathbf{b}, and all those constraints wi0w_{i}\geq 0, for which vi=0v_{i}=0).

  • The vector 𝐯d\mathbf{v}\in\mathbb{R}^{d} is a basic solution if 𝐀𝐯=𝐛\mathbf{A}\mathbf{v}=\mathbf{b}, and there are dd linearly independent constraints among all the constraints that are active at 𝐯\mathbf{v}.

  • 𝐯\mathbf{v} is a basic feasible solution if it is a basic solution and 𝐯\mathbf{v}\in\mathbb{P}_{*} (i.e., if it also satisfies 𝐯𝟎d\mathbf{v}\geq\mathbf{0}_{d}).

Proof.

Theorem 2.3 by Bertsimas, Tsitsiklis [6] directly yields the equivalence of 1. and 2. That theorem also states that these are equivalent to 𝐯\mathbf{v} being a basic feasible solution, and we shall show that this condition is equivalent to 4. and 5.

Let 𝐯\mathbf{v} be a basic feasible solution, which means that there are dd linearly independent active constraints at 𝐯\mathbf{v}. Let I0I_{0} be the set of indices such that vi=0v_{i}=0 for iI0i\in I_{0}. Then Theorem 2.2 by Bertsimas, Tsitsiklis [6] says that 𝐯\mathbf{v} is the only vector that satisfies 𝐀𝐰=𝐛\mathbf{A}\mathbf{w}=\mathbf{b} and wi=0w_{i}=0 for iI0i\in I_{0}. Now let 𝐳\mathbf{z}\in\mathbb{P}_{*} satisfy 𝒮(𝐳)𝒮(𝐯)\mathcal{S}(\mathbf{z})\subseteq\mathcal{S}(\mathbf{v}), that is zi=0z_{i}=0 for iI0i\in I_{0}. It follows that 𝐳=𝐯\mathbf{z}=\mathbf{v} because the solution to 𝐀𝐰=𝐛\mathbf{A}\mathbf{w}=\mathbf{b} and wi=0w_{i}=0 for iI0i\in I_{0} was unique; thus showing 4. Note that this also proves that if 𝐯\mathbf{v} is a basic feasible solution, then it is the unique design supported on 𝒮(𝐯)\mathcal{S}(\mathbf{v}), i.e., it proves 5.

Conversely, let 𝐯\mathbf{v} satisfy 4. and again denote I0I_{0} as the set of indices for which vi=0v_{i}=0. Now suppose that 𝐯\mathbf{v} is not a basic solution, which means (in light of Theorem 2.2 by Bertsimas, Tsitsiklis [6]) that there exists 𝐳𝐯\mathbf{z}\neq\mathbf{v} such that 𝐀𝐳=𝐛\mathbf{A}\mathbf{z}=\mathbf{b} and zi=0z_{i}=0 for iI0i\in I_{0} (note that zj0z_{j}\geq 0 need not be satisfied outside I0I_{0}). However, any point 𝐰~\widetilde{\mathbf{w}} that lies on the line going through 𝐳\mathbf{z} and 𝐯\mathbf{v} also satisfies 𝐀𝐰~=𝐛\mathbf{A}\widetilde{\mathbf{w}}=\mathbf{b}, w~i=0\widetilde{w}_{i}=0 for iI0i\in I_{0}. Now take 𝐰~\widetilde{\mathbf{w}} as the point on this line that satisfies w~i0\widetilde{w}_{i}\geq 0 for iI0i\not\in I_{0} and such that w~j=0\widetilde{w}_{j}=0 for some jI0j\not\in I_{0}. Such 𝐰~\widetilde{\mathbf{w}} can be constructed as the intersection of the 𝐳𝐯\mathbf{z}\mathbf{v} line and the relative boundary of \mathbb{P}_{*}, and it exists because 𝐯\mathbf{v}\in\mathbb{P}_{*} and vi>0v_{i}>0 for iI0i\not\in I_{0}. This 𝐰~\widetilde{\mathbf{w}}\in\mathbb{P}_{*} satisfies 𝒮(𝐰~)𝒮(𝐯)\mathcal{S}(\widetilde{\mathbf{w}})\subset\mathcal{S}(\mathbf{v}), which is in contradiction with 4. Note that if we assume 5. instead of 4., then 𝐰~\widetilde{\mathbf{w}}\in\mathbb{P}_{*} constructed above is also in contradiction with 5. because it satisfies 𝒮(𝐰~)𝒮(𝐯)\mathcal{S}(\widetilde{\mathbf{w}})\subseteq\mathcal{S}(\mathbf{v}), but 𝐰~𝐯\widetilde{\mathbf{w}}\neq\mathbf{v}. This concludes the proof that 𝐯\mathbf{v} is a basic feasible solution if and only if it satisfies either 4. or 5. Now Theorem 2.3 by Bertsimas, Tsitsiklis [6] gives the equivalence of 1., 4. and 5.

If rank(𝐀)=q\mathrm{rank}(\mathbf{A})=q, then Theorem 2.4 by Bertsimas, Tsitsiklis [6] yields that 𝐯\mathbf{v} is a basic feasible solution if and only if it satisfies 3. However, the proof of that theorem can be easily extended to the case of 𝐀\mathbf{A} of a general rank sqs\leq q. It follows that 3. is equivalent to 4. and 5. ∎

Appendix C Ad Example 4

In Example 4, the full form of 𝐀𝐰=vech(𝐈4)\mathbf{A}\mathbf{w}=\mathrm{vech}(\mathbf{I}_{4}) is

\begin{bmatrix}
+1&+1&+1&+1&+1&+1&+1&+1\\
-1&-1&-1&-1&+1&+1&+1&+1\\
+1&+1&+1&+1&+1&+1&+1&+1\\
-1&-1&+1&+1&-1&-1&+1&+1\\
+1&+1&-1&-1&-1&-1&+1&+1\\
+1&+1&+1&+1&+1&+1&+1&+1\\
-1&+1&-1&+1&-1&+1&-1&+1\\
+1&-1&+1&-1&-1&+1&-1&+1\\
+1&-1&-1&+1&+1&-1&-1&+1\\
+1&+1&+1&+1&+1&+1&+1&+1
\end{bmatrix}\mathbf{w}=\begin{bmatrix}1\\ 0\\ 1\\ 0\\ 0\\ 1\\ 0\\ 0\\ 0\\ 1\end{bmatrix}. \qquad (24)
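
A possible R sketch of the exact, rational-arithmetic vertex enumeration for this small example, using the rcdd package (Geyer and Meeden [23]), follows. The matrices are rebuilt from scratch (they coincide with (24) up to the ordering of rows and columns), the rational input is passed as character strings, and the output follows the cdd conventions for V-representations; this is an illustration of the approach, not our exact production code.

## Exact vertex enumeration of P_* = {w : Aw = b, w >= 0} for Example 4 via rcdd.
library(rcdd)
x  <- as.matrix(expand.grid(x1 = c(-1, 1), x2 = c(-1, 1), x3 = c(-1, 1)))
Fm <- cbind(1, x)                                      # f(y)' = (1, y') for Model (17) with k = 3
vech <- function(M) M[lower.tri(M, diag = TRUE)]
A <- apply(Fm, 1, function(f) vech(tcrossprod(f)))     # the matrix of (24), up to row/column order
b <- vech(diag(4))                                     # right-hand side vech(I_4)

qchar <- function(M) structure(as.character(M), dim = dim(M))  # integer entries as exact rationals
hrep <- makeH(a1 = qchar(-diag(8)), b1 = as.character(rep(0, 8)),  # inequalities -w <= 0
              a2 = qchar(A),        b2 = as.character(b))          # equalities    Aw = b
vrep <- scdd(hrep)$output                              # exact double description step
vrep[, -(1:2)]                                         # rows = vertices of P_*: the two VODs,
                                                       # with weights "0" and "1/4" in exact form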

Appendix D Vertex bounds

If ν1,ν2\nu_{1},\nu_{2} are two distinct vertex optimal designs (VODs), then neither of the sets 𝒮(ν1)\mathcal{S}(\nu_{1}) and 𝒮(ν2)\mathcal{S}(\nu_{2}) can be a subset of the other, as follows from the part 151\Leftrightarrow 5 of Theorem 4. That is, the system of the supports of the VODs forms a Sperner family; this provides a simple upper limit on the number of the VODs via Sperner’s theorem in terms of the simplest characteristic dd: (dd/2)\ell\leq\binom{d}{\lfloor d/2\rfloor}. If we also know the problem rank ss or the dimension tt of Υ\Upsilon_{*}, we can improve Sperner’s bound as follows: since each ss-tuple of independent elementary information matrices “supports” at most one VOD (see 131\Leftrightarrow 3 in Theorem 2), we have (ds)=(dt)\ell\leq\binom{d}{s}=\binom{d}{t}. For many situations, we can further improve the bound by using the H-representation of the full-dimensional isomorphic polytope 𝕋\mathbb{T}_{*} and McMullen’s Upper Bound Theorem (see, e.g., the introduction in Avis and Devroye [3]), which gives

\ell\leq\binom{d-\lfloor\frac{t+1}{2}\rfloor}{s}+\binom{d-\lfloor\frac{t+2}{2}\rfloor}{s}.

On the other hand, since the support size of each VOD is at most ss, and any maximal optimal design must be a convex combination of some VODs, we must have d/s\ell\geq\lceil d/s\rceil. Also, since the dimension of Υ\Upsilon_{*} is tt and the VODs generate Υ\Upsilon_{*}, we must have t+1\ell\geq t+1.

Specifically, for t=0t=0 (t=1t=1, t=2t=2) we have =1\ell=1 (=2\ell=2, {3,,d}\ell\in\{3,\ldots,d\}).
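
For a concrete illustration, the bounds are trivial to evaluate; for Model (14) with k = 5, where d = 32, s = 11, t = 21 and the enumerated number of VODs is ℓ = 35328, a few lines of R give:

## Bounds on the number of VODs for Model (14) with k = 5 (d = 32, s = 11, t = 21).
d <- 32; s <- 11; tdim <- d - s
choose(d, floor(d / 2))                        # Sperner bound:               601080390
choose(d, s)                                   # bound via the problem rank:  129024480
choose(d - floor((tdim + 1) / 2), s) +
  choose(d - floor((tdim + 2) / 2), s)         # McMullen upper bound:           705432
ceiling(d / s)                                 # lower bound ceil(d/s):               3
tdim + 1                                       # lower bound t + 1:                  22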

References

  • Ahipasaoglu [2021] Ahipasaoglu SD (2021). A branch-and-bound algorithm for the exact optimal experimental design problem. Statistics and Computing 31, 1–11.
  • Atkinson et al. [2007] Atkinson AC, Donev A, Tobias R (2007). Optimum experimental designs, with SAS. Oxford University Press.
  • Avis and Devroye [2000] Avis D, Devroye L (2000). Estimating the number of vertices of a polyhedron. Information processing letters 73(3-4), 137–143.
  • Avis and Fukuda [1996] Avis D, Fukuda K (1996). Reverse search for enumeration. Discrete applied mathematics 65(1-3), 21–46.
  • Avis and Jordan [2015] Avis D, Jordan C (2015). Comparative computational results for some vertex and facet enumeration codes. arXiv preprint arXiv:1510.02545.
  • Bertsimas, Tsitsiklis [1997] Bertsimas D, Tsitsiklis JN (1997). Introduction to Linear Optimization. Athena Scientific.
  • Chang and Lin [2007] Chang FC, Lin HM (2007). On minimally-supported D-optimal designs for polynomial regression with log-concave weight function. Metrika 65, 227–233.
  • Cheng [1987] Cheng CS (1987). An Application of the Kiefer-Wolfowitz Equivalence Theorem to a Problem in Hadamard Transform Optics. The Annals of Statistics 15, 1593–1603.
  • Chernoff [1953] Chernoff H (1953). Locally optimal designs for estimating parameters. Ann. Math. Statist. 24, 586–602.
  • Coetzer and Haines [2017] Coetzer R, Haines LM (2017). The construction of D-and I-optimal designs for mixture experiments with linear constraints on the components. Chemometrics and Intelligent Laboratory Systems 171, 112–124.
  • Dantzig [1951] Dantzig GB (1951). Maximization of a linear function of variables subject to linear inequalities. Activity analysis of production and allocation 13, 339–347.
  • Dette and Studden [1997] Dette H, Studden WJ (1997). The theory of canonical moments with applications in statistics, probability, and analysis. John Wiley & Sons.
  • Dean et al. [2015] Dean AM, Morris M, Stufken J, Bingham D (Eds.) (2015). Handbook of design and analysis of experiments. Chapman & Hall/CRC.
  • Eberly [2021] Eberly D (2021). Robust and error-free geometric computing. CRC Press.
  • Elfving [1952] Elfving G (1952). Optimum allocation in linear regression theory. Annals of Mathematical Statistics 23, 255–262.
  • Farrell et al. [1967] Farrell RH, Kiefer J, Walbran A (1967). Optimum multivariate designs. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability (Vol. 1, pp. 113-138). Berkeley, CA. University of California Press.
  • Fedorov [1972] Fedorov VV (1972). Theory of optimal experiments. Academic Press.
  • Filová and Harman [2020] Filová L, Harman R (2020). Ascent with quadratic assistance for the construction of exact experimental designs. Computational Statistics 35, 775–801.
  • Freise et al. [2020] Freise F, Holling H, Schwabe R (2020). Optimal designs for two-level main effects models on a restricted design region. Journal of Statistical Planning and Inference 204, 45–54.
  • Fukuda [1997] Fukuda K (1997). cdd/cdd+ Reference Manual. Institute for Operations Research. ETH-Zentrum, 91–111.
  • Fukuda [2020] Fukuda K (2020). Polyhedral computation. Department of Mathematics and Institute of Theoretical Computer Science, ETH Zurich. https://doi.org/10.3929/ethz-b-000426218
  • Gaffke et al. [2014] Gaffke N, Graßhoff U, Schwabe R (2014). Algorithms for approximate linear regression design applied to a first order model with heteroscedasticity. Computational Statistics and Data Analysis 71, 1113–1123.
  • Geyer and Meeden [2023] Geyer CJ, Meeden GD (2023). R package rcdd (C Double Description for R), version 1.5. https://CRAN.R-project.org/package=rcdd
  • Ghosh et al. [2008] Ghosh A, Boyd S, Saberi A (2008). Minimizing Effective Resistance of a Graph. SIAM Review 50, 37–66.
  • Graßhoff et al. [2004] Graßhoff U, Großmann H, Holling H, Schwabe R (2004). Optimal designs for main effects in linear paired comparison models. Journal of Statistical Planning and Inference 126, 361–376.
  • Harman [2008] Harman R (2008). Equivalence theorem for Schur optimality of experimental designs. Journal of Statistical Planning and Inference 138, 1201–1209.
  • Harman et al. [2020] Harman R, Filová L, Richtárik P (2020). A Randomized Exchange Algorithm for Computing Optimal Approximate Designs of Experiments. Journal of the American Statistical Association 115, 348–361.
  • He [2010] He X (2010). Laplacian regularized D-optimal design for active learning and its application to image retrieval. IEEE Transactions on Image Processing 19, 254–263.
  • He [2017] He X (2017). Interleaved lattice-based minimax distance designs. Biometrika 104(3), 713–725.
  • Hedayat et al. [1999] Hedayat AS, Sloane NJA, Stufken J (1999). Orthogonal Arrays: Theory and Applications. Springer.
  • Heiligers [1992] Heiligers B (1992). Admissible experimental designs in multiple polynomial regression. Journal of Statistical Planning and Inference, 31, 219–233.
  • Khachiyan et al. [2009] Khachiyan L, Boros E, Borys K, Gurvich V, Elbassioni K (2009). Generating all vertices of a polyhedron is hard. In Twentieth Anniversary Volume: (pp. 1–17). Springer.
  • Kiefer [1959] Kiefer J (1959). Optimum experimental designs. Journal of the Royal Statistical Society: Series B 21, 272–304.
  • Kushner [1998] Kushner HB (1998). Optimal and efficient repeated-measurements designs for uncorrelated observations. Journal of the American Statistical Association 93(443), 1176–1187.
  • Li and Majumdar [2008] Li G, Majumdar D (2008). D-optimal designs for logistic models with three and four parameters. Journal of Statistical Planning and Inference 138(7), 1950–1959.
  • Lim and Studden [1988] Lim YB, Studden WJ (1988). Efficient Ds-optimal designs for Multivariate Polynomial Regression on the q-Cube. The Annals of Statistics 16(3), 1225–1240.
  • Lucas et al. [2023] Lucas A, Scholz I, Boehme R, Jasson S, Maechler M (2023). R package gmp (Multiple Precision Arithmetic), version 0.7-2 https://CRAN.R-project.org/package=gmp
  • Matheiss and Rubin [1980] Matheiss TH, Rubin DS (1980). A survey and comparison of methods for finding all vertices of convex polyhedral sets. Mathematics of operations research 5(2), 167–185.
  • Motzkin et al. [1953] Motzkin TS, Raiffa H, Thompson GL, Thrall RM (1953). The double description method. Contributions to the Theory of Games 2(28), 51–73.
  • Nijenhuis and Wilf [1972] Nijenhuis A, Wilf HS (1972). Representations of integers by linear forms in nonnegative integers. Journal of Number Theory 4, 98–106.
  • Pázman [1986] Pázman A (1986). Foundation of Optimum Experimental Design. Reidel.
  • Pesotchinsky [1975] Pesotchinsky LL (1975). D-optimum and quasi-D-optimum second-order designs on a cube. Biometrika 62(2), 335–340.
  • Pesotchinsky [1978] Pesotchinsky L (1978). Φp\Phi_{p}-optimal second order designs for symmetric regions. Journal of Statistical Planning and Inference 2(2), 173–188.
  • Pukelsheim [1993] Pukelsheim F (1993). Optimal Design of Experiments. Wiley.
  • Pukelsheim and Rieder [1992] Pukelsheim F, Rieder S (1992). Efficient rounding of approximate designs. Biometrika 79, 763–770.
  • Robinson and Anderson-Cook [2010] Robinson TJ, Anderson-Cook CM (2010). A closer look at D-optimality for screening designs. Quality Engineering 23(1), 1–14.
  • Rockafellar [1996] Rockafellar T (1996). Convex Analysis: Princeton Landmarks in Mathematics and Physics, 18, Princeton University Press
  • Rosa and Harman [2016] Rosa S, Harman R (2016). Optimal approximate designs for estimating treatment contrasts resistant to nuisance effects. Statistical Papers 57, 1077–1106.
  • Sagnol and Harman [2015] Sagnol G, Harman R (2015). Computing exact D-optimal designs by mixed integer second-order cone programming. The Annals of Statistics 43, 2198–2224.
  • Schwabe and Wierich [1995] Schwabe R, Wierich W (1995). D-optimal designs of experiments with non-interacting factors. Journal of Statistical Planning and Inference 44, 371–384.
  • Schwabe and Wong [2003] Schwabe R, Wong WK (2003). Efficient Product Designs for Quadratic Models on the Hypercube. Sankhyā: The Indian Journal of Statistics 65(3), 649–659.
  • Snee [1979] Snee RD (1979). Experimental designs for mixture systems with multicomponent constraints. Communications in Statistics-Theory and Methods 8(4), 303–326.
  • Todd [2016] Todd MJ (2016). Minimum-Volume Ellipsoids: Theory and Algorithms. SIAM.
  • Wilson [1975] Wilson RM (1975). An existence theory for Pairwise Balanced Designs, III: Proof of the existence conjectures. Journal of Combinatorial Theory 18, 71–79.
  • Ziegler [2012] Ziegler GM (2012). Lectures on polytopes (Vol. 152). Springer.