This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

A Unified Framework for Adjustable Robust Optimization with Endogenous Uncertainty

Qi Zhang Corresponding author ([email protected]) Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, MN 55455, USA Wei Feng State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
Abstract

This work proposes a framework for multistage adjustable robust optimization that unifies the treatment of three different types of endogenous uncertainty, where decisions, respectively, (i) alter the uncertainty set, (ii) affect the materialization of uncertain parameters, and (iii) determine the time when the true values of uncertain parameters are observed. We provide a systematic analysis of the different types of endogenous uncertainty and highlight the connection between optimization under endogenous uncertainty and active learning. We consider decision-dependent polyhedral uncertainty sets and propose a decision rule approach that incorporates both continuous and binary recourse, including recourse decisions that affect the uncertainty set. The proposed method enables the modeling of decision-dependent nonanticipativity and results in a tractable reformulation of the problem. We demonstrate the effectiveness of the approach in computational experiments that cover a range of applications, including plant redesign, maintenance planning with inspections, optimizing revision points in capacity planning, and production scheduling with active parameter estimation. The results show significant benefits from the proper modeling of endogenous uncertainty and active learning.

Keywords: adjustable robust optimization, endogenous uncertainty, active learning

1 Introduction

Robust optimization is an approach to decision making under uncertainty that has gained tremendous popularity in the optimization community over the last two decades. It is particularly attractive in situations in which (i) the set of possible uncertainty realizations, commonly referred to as the uncertainty set, is known but the probability distribution is not, (ii) feasibility over the entire uncertainty set or the worst-case performance is of main interest, or (iii) alternative approaches, such as scenario-based stochastic programming, are computationally significantly less tractable. Earlier works on robust optimization 1, 2 were limited to the static case, which only considers here-and-now decisions that need to be determined before the uncertainty is realized. In contrast, adjustable robust optimization (ARO) 3 provides a framework that also incorporates wait-and-see (or recourse) decisions, which can be determined after the true values of some uncertain parameters are revealed. This makes ARO suitable for a large set of sequential decision-making problems, hence greatly expanding the applicability of the robust optimization paradigm. There has been a plethora of theoretical and practical contributions addressing a variety of ARO problems. A particular research focus has been the development of tractable approximations, e.g. in the form of decision rules 3, 4, and solution methods 5, 6, with many of the more recent efforts directed at the efficient handling of discrete recourse decisions 7, 8, 9, 10, 11, 12. We refer the reader to Yanıkoğlu et al. 13 for a comprehensive survey on ARO.

The term “adjustable robust optimization” had not appeared in the process systems engineering (PSE) literature until 2016 14, 15, 16. Only then, as formally discussed by Zhang et al. 17, the PSE community realized that it actually has been applying the concept of (two-stage) ARO for decades, but under a different name: flexibility analysis 18. Many of the approaches developed in this line of work resemble the ones presented in the robust optimization literature. However, traditional flexibility analysis focuses primarily on nonlinear problems, which may be the reason why it so far has not been recognized or noticed by the operations research community. In the last few years, the number of ARO-related works in PSE has increased rapidly, addressing diverse applications in process design 19, 20, planning and scheduling 14, 15, 16, 21, 22, 23, 24, model predictive control 25, 26, 27, supply chain optimization 28, 29, etc.

In stochastic optimization, one distinguishes between exogenous and endogenous uncertainty. While exogenous uncertainty is not affected by the decision maker’s actions, certain properties of endogenous uncertain parameters are decision-dependent. Endogenous uncertainty has its origin in stochastic programming 30, and mainly two types of endogenous uncertainty have been considered in the literature: type 1 where decisions alter the underlying probability distribution of an uncertain parameter 31, 32, 33, and type 2 where decisions affect the time at which an uncertain parameter materializes or its true value is revealed. The vast majority of existing works address type-2 endogenous uncertainty 34, 35, 36, 37, 38, 39, 40, 41, 42, 43. Here, the main challenge is the modeling of decision-dependent nonanticipativity, which is typically achieved by encoding a conditional scenario tree. The size of this scenario tree grows exponentially with the number of uncertain parameters and decisions affecting the uncertainty, which leads to dramatically increased computational complexity compared to the case with only exogenous uncertainty. Hence, most research efforts have focused on the development of more efficient formulations 35, 36, 38, 42, 43 and solution methods 35, 39, 40, 37, 41, 43. We refer to Apap and Grossmann 43 for a comprehensive review of stochastic programming with endogenous uncertainty.

Only recently, endogenous uncertainty has also been considered in robust optimization. Poss 44 and Nohadani and Sharma 45 address static robust optimization with decision-dependent polyhedral uncertainty sets. A similar approach is taken by Lappas and Gounaris 46, who further extend their framework to consider multistage robust process scheduling where the materialization of task-specific uncertainties, such as processing time and production yield, depends on the execution of the task 14, 22; however, the uncertainty set is only affected by first-stage decisions, and all recourse variables are continuous. Avraamidou and Pistikopoulos 47 apply multiparametric programming to address endogenous uncertainty in two-stage robust optimization. Finally, Feng et al. 48 consider the multistage case with both continuous and binary recourse as well as uncertainty sets that can be affected by decisions at every stage, for which a decision rule approach is applied to derive tractable approximations.

The above-mentioned references investigate problems in which decisions directly affect the stochastic nature of uncertain parameters, which in the robust optimization context translates into uncertainty sets whose shape, size, or dimensionality may be decision-dependent. Also, although the materialization of uncertain parameters can be decision-dependent, it is assumed that the true values of uncertain parameters are observed immediately after their materialization. In a series of recent works, Vayanos and coworkers 49, 50, 51 consider robust optimization with decision-dependent information discovery 38, where decisions determine whether and when uncertain parameters are observed. This constitutes a conceptually different type of endogenous uncertainty, where, instead of an underlying stochastic process, the source of uncertainty is the lack of information, which can be acquired with additional effort. In this case, the uncertainty set is reduced (“shrinks”) with additional observations. The proposed framework provides a means of modeling active learning, which refers to selecting experiments sequentially based on the outcomes of previous experiments. It is hence a promising approach to optimizing the trade-off between exploration and exploitation in sequential decision-making problems, particularly those involving complex constraints. Existing works have applied this approach to the learning of customer behavior through dynamic pricing 49 and active preference elicitation with application to housing allocation 51.

We have come to realize that many problems in PSE require the simultaneous consideration of multiple types of endogenous uncertainty. Examples include strategic capacity planning, design of sensor networks, integrated online process optimization and parameter estimation, and maintenance planning with inspections. To address these problems, we need an approach that can consider all previously described types of endogenous uncertainty simultaneously, which, to the best of our knowledge, does not yet exist in the literature. In this work, we develop such a unified ARO framework, which significantly expands our capability to model data-driven optimization problems under uncertainty, particularly those involving active learning. The main contributions of this work are as follows:

  • We introduce a new refined classification of endogenous uncertainty that more comprehensively reflects the current understanding of the subject. We further provide a unifying robust optimization perspective and highlight the connection between endogenous uncertainty and active learning.

  • We present two-stage and multistage robust optimization formulations and show that they can be used to simultaneously consider all defined types of endogenous uncertainty subject to polyhedral uncertainty sets.

  • A decision rule approach based on previous work 48 is employed to develop tractable approximations of the given problems, which involve both continuous and binary recourse variables. One major novelty of this work is the use of auxiliary uncertain parameters to model decision-dependent nonanticipativity in the multistage setting. We derive a reformulation that can be directly solved using off-the-shelf optimization solvers.

  • The versatility and efficacy of the proposed modeling framework are demonstrated in several computational case studies, which include plant redesign, maintenance planning with inspections, optimizing revision points in capacity planning, and production scheduling with active parameter estimation.

The remainder of this paper is organized as follows. In Section 2, we propose a new refined classification of endogenous uncertainty and systematically analyze the different types of endogenous uncertainty from a robust optimization perspective. In Section 3, the connection between endogenous uncertainty and active learning is established. We develop the two-stage formulation incorporating all types of endogenous uncertainty in Section 4 before presenting the multistage formulation in Section 5. The proposed decision rule approach is detailed in Section 6. In Section 7, results from the computational studies are presented. Finally, in Section 8, we close with some concluding remarks.

Notation

We use lowercase and uppercase boldface letters to denote vectors and matrices, respectively, e.g. 𝒙n\bm{x}\in\mathbb{R}^{n} and 𝑨n×m\bm{A}\in\mathbb{R}^{n\times m}. Scalar quantities are denoted by non-boldface letters. A vector 𝒙\bm{x}’s iith element is denoted by xix_{i}. We use \circ and 𝟙()\mathbbm{1}(\cdot) to denote the Hadamard multiplication operator and the indicator function, respectively. Furthermore, 𝟎\bm{0}, 𝒆\bm{e}, and 𝑬\bm{E} are the zero vector, all-ones vector, and all-ones matrix, respectively, while 𝒆i\bm{e}_{i} is the standard basis vector whose iith element is 1; their dimensions can be inferred from the context.

2 Endogenous Uncertainty Through the Robust Optimization Lens

As mentioned in Section 1, the literature distinguishes between type-1 and type-2 endogenous uncertainty. Here, we further refine the type-2 classification and propose the following definition of three different types of endogenous uncertainty:

  • Type 1: Decisions alter the probability distribution of uncertain parameters.

  • Type 2a: Decisions determine whether and when uncertain parameters materialize, i.e. actually become physically meaningful.

  • Type 2b: Decisions determine whether and when the true values of uncertain parameters are observed.

Endogenous uncertain parameters can be of more than one type. For example, a company may alter the probability distribution of the demand for its product by changing the selling price, in which case the product demand is a type-1 endogenous uncertain parameter. In addition, if we still have to decide whether or not to launch the product in the first place, the product demand is also type-2a endogenous since it will only materialize if the product is actually available on the market. Moreover, a customer’s product demand is observed when the order is placed; it is type-2b endogenous if we can affect the time of the order, e.g. by specifying it in a contract.

In robust optimization, instead of specifying a probability distribution, uncertainty is described using an uncertainty set, which is the set of all realizations of the uncertain parameters considered possible in the problem. In other words, robust optimization only requires the support of the probability distribution. A two-stage robust optimization problem can be generally formulated as follows:

minimize𝒙𝒳,𝒙~\displaystyle\operatorname*{minimize}_{\bm{x}\in\mathcal{X},\,\bm{\tilde{x}}} max𝝃Ξf(𝒙,𝒙~(𝝃),𝝃)\displaystyle\max_{\bm{\xi}\in\Xi}f(\bm{x},\bm{\tilde{x}}(\bm{\xi}),\bm{\xi}) (1)
subjectto\displaystyle\mathrm{subject\;to} 𝒈(𝒙,𝒙~(𝝃),𝝃)𝟎,𝒙~(𝝃)𝒳~𝝃Ξ,\displaystyle\bm{g}(\bm{x},\bm{\tilde{x}}(\bm{\xi}),\bm{\xi})\leq\bm{0},\;\bm{\tilde{x}}(\bm{\xi})\in\widetilde{\mathcal{X}}\quad\forall\,\bm{\xi}\in\Xi,

where 𝒙𝒳\bm{x}\in\mathcal{X} are the first-stage variables and 𝒙~𝒳~\bm{\tilde{x}}\in\widetilde{\mathcal{X}} are the second-stage variables, which are functions of the uncertain parameters 𝝃K\bm{\xi}\in\mathbb{R}^{K}. The objective and constraint functions are denoted by ff and 𝒈\bm{g}, respectively. Problem (1) minimizes the worst-case cost while ensuring that the constraints are satisfied for all 𝝃Ξ\bm{\xi}\in\Xi. By applying an epigraph reformulation, a problem of form (1) can be reformulated into a problem of the following form:

minimize𝒙𝒳,𝒙~\displaystyle\operatorname*{minimize}_{\bm{x}\in\mathcal{X},\,\bm{\tilde{x}}} x1\displaystyle x_{1} (2)
subjectto\displaystyle\mathrm{subject\;to} 𝒈(𝒙,𝒙~(𝝃),𝝃)𝟎,𝒙~(𝝃)𝒳~𝝃Ξ,\displaystyle\bm{g}(\bm{x},\bm{\tilde{x}}(\bm{\xi}),\bm{\xi})\leq\bm{0},\;\bm{\tilde{x}}(\bm{\xi})\in\widetilde{\mathcal{X}}\quad\forall\,\bm{\xi}\in\Xi,

where the objective function is deterministic. Formulation (2) will be used in the remainder of this paper as it is more convenient for our discussion and subsequent reformulations.

  • Remark

    There is a common misconception that a robust optimization problem has to consider a worst-case cost function. As it is apparent from (2), in theory, one can also consider any deterministic or deterministic equivalent (e.g. expressed in terms of scenarios) cost function. Constraint satisfaction with respect to the entire uncertainty set can be considered independently from the choice of objective function. Often, a non-worst-case objective function is more appropriate (see Section 7 for two examples).

In the case of endogenous uncertainty in the two-stage setting, the uncertainty depends on the first-stage decisions 𝒙\bm{x}. For each type of endogenous uncertainty, we show how 𝒙\bm{x} affect the following three major components of problem (2):

  • the uncertainty set Ξ\Xi, which is constant in the case of only exogenous uncertainty but depends on 𝒙\bm{x} if the uncertainty is endogenous, i.e. Ξ=Ξ(𝒙)\Xi=\Xi(\bm{x});

  • the deterministic feasible set (DFS) defined as

    (𝒙,𝝃):={𝒙~𝒳~:𝒈(𝒙,𝒙~,𝝃)𝟎},\mathcal{F}(\bm{x},\bm{\xi}):=\left\{\bm{\tilde{x}}\in\widetilde{\mathcal{X}}:\bm{g}(\bm{x},\bm{\tilde{x}},\bm{\xi})\leq\bm{0}\right\},

    which is the set of feasible second-stage solutions given 𝒙\bm{x} and 𝝃\bm{\xi};

  • and the second-stage recourse decisions 𝒙~(𝝃)\bm{\tilde{x}}(\bm{\xi}) as functions of 𝝃\bm{\xi}.

Type 1 (uncertainty alteration)

The probability distribution of type-1 endogenous uncertain parameters including its support can be altered by decisions. From the robust optimization perspective, this means that the shape and size of the uncertainty set Ξ(𝒙)\Xi(\bm{x}) depend on the first-stage decisions 𝒙\bm{x}. For an example involving two uncertain parameters, ξA\xi_{A} and ξB\xi_{B}, Figure 1a shows the uncertainty sets for two different choices of 𝒙\bm{x}. One results in a polyhedral uncertainty set with four facets while the other results in a polytope with three facets.

Refer to caption

(a) Type 1

Refer to caption

(b) Type 2a

Refer to caption

(c) Type 2b
Figure 1: Decisions affect the uncertainty set differently depending on the type of endogenous uncertainty: (a) type 1: shape and size of the uncertainty set may be altered; (b) type 2a: dimensionality of the uncertainty set changes depending on which uncertain parameters materialize; (c) type 2b: uncertainty set does not change, but its reduction over time does, i.e. it only reduces if realization of uncertainty is observed.
Type 2a (uncertainty materialization)

In the case of type-2a endogenous uncertainty, 𝒙\bm{x} change the uncertainty set by affecting which uncertain parameters actually materialize. In the example shown in Figure 1b, Ξ(𝒙)\Xi(\bm{x}) is the gray shaded polytope if both ξA\xi_{A} and ξB\xi_{B} materialize. However, if 𝒙\bm{x} are chosen such that ξA\xi_{A} does not materialize, Ξ(𝒙)\Xi(\bm{x}) becomes the solid red line segment that is a one-dimensional set for ξB\xi_{B}; hence, 𝒙\bm{x} affect the dimensionality of the uncertainty set. Generally, when some uncertain parameters do not materialize, Ξ(𝒙)\Xi(\bm{x}) becomes the projection of the full-dimensional uncertainty set onto the space of the materialized uncertain parameters.

If an uncertain parameter does not materialize, it is physically meaningless and should therefore be irrelevant for the problem. More precisely, this means that the following DFS dependence condition has to hold: if 𝒙\bm{x} are chosen such that some uncertain parameters do not materialize, then (𝒙,𝝃)\mathcal{F}(\bm{x},\bm{\xi}) has to be independent of those unmaterialized parameters. It is important to note that decision-dependent materialization of parameters is not a feature of uncertainty. It equally occurs in deterministic problems. Hence, the deterministic model, in which 𝝃\bm{\xi} is fixed, should already satisfy the DFS dependence condition, which usually directly extends to the robust formulation.

Finally, recourse decisions cannot depend on unmaterialized parameters. This, however, does not imply that recourse decisions are functions of all materialized parameters. While materialization is a necessary condition for an uncertain parameter to be considered in the recourse function, the sufficient condition is that its true value is observed. This subtle but important distinction is often ignored as almost all existing works on endogenous uncertainty assume that once an uncertain parameter materializes, its true value is automatically observed.

Type 2b (uncertainty observation)

In the case of type-2b endogenous uncertainty, we can decide if and when (in the multistage case) the true values of uncertain parameters are observed. Here, the underlying probability distribution and hence the uncertainty set is unaffected, but the reduction of uncertainty over time depends on when and which observations are made. This is illustrated in Figure 1c, which shows an instance in which the true values of the uncertain parameters ξA\xi_{A} and ξB\xi_{B} are ξ¯A\bar{\xi}_{A} and ξ¯B\bar{\xi}_{B}, respectively. If both parameters are observed, there will be no uncertainty in the second stage as the uncertainty set reduces to the point (ξ¯A,ξ¯B)(\bar{\xi}_{A},\bar{\xi}_{B}). However, if we only observe ξA\xi_{A} and find it to be ξ¯A\bar{\xi}_{A}, the uncertainty with respect to ξB\xi_{B} remains and its reduced uncertainty set becomes the red solid line. Similarly, if we only observe ξB\xi_{B}, the reduced uncertainty set for ξA\xi_{A} becomes the blue solid line. Note that the reduced uncertainty set is conditional, i.e. it depends on the observations made. Clearly, a parameter can only be observed if it materializes. Also, reiterating our previous remark, recourse decisions can only be functions of observed uncertain parameters.

The above discussion references the two-stage robust optimization problem but naturally extends to the multistage case. An overview of the characteristics of the different types of endogenous uncertainty is shown in Table 1.

Table 1: Overview of the impact of decisions on the uncertainty set, the dependence of the DFS on uncertain parameters, and the dependence of the recourse functions on uncertain parameters for the different types of endogenous uncertainty.
Uncertainty Set DFS Dependence Recourse Dependence
Type 1 shape, size unaffected unaffected
Type 2a dimensionality only dependence on materialized parameters no dependence on unmaterialized parameters
Type 2b reduction over time unaffected only dependence on observed parameters

3 Endogenous Uncertainty and Active Learning

In the context of statistical learning, active learning is the concept of actively gathering information for the purpose of statistical inference. It is also referred to as optimal experimental design as the decisions regarding information discovery are made with an objective in mind, such as minimizing the number of data points required to achieve a given level of confidence in the resulting statistical model. In this section, we highlight the connection between active learning and endogenous uncertainty, and show how it can be conceptualized in a set-based robust optimization framework.

The goal of learning is to gain insights into unknown properties, which can be viewed as uncertain parameters whose level of uncertainty we wish to reduce. Hence, learning is a means of uncertainty reduction. It does so by making new observations that condition the probability distribution; it does not alter the underlying distribution, which is an important distinction. With this perspective, we can model active learning using the notion of endogenous uncertainty. Let 𝝃K\bm{\xi}\in\mathbb{R}^{K} be the uncertain parameters of interest and 𝜻M\bm{\zeta}\in\mathbb{R}^{M} be uncertain parameters that can be observed before the realization of 𝝃\bm{\xi}. Before observing any 𝜻\bm{\zeta}, the probability distribution of 𝝃\bm{\xi} is (𝝃)\mathbb{P}(\bm{\xi}). The decision maker can now choose which 𝜻\bm{\zeta} to observe in order to improve the current information state with regard to 𝝃\bm{\xi}, indicated by the binary variables 𝒛{0,1}M\bm{z}\in\{0,1\}^{M} where ziz_{i} equals 1 if ζi\zeta_{i} is observed and 0 otherwise. After observing the chosen parameters, the updated probability distribution becomes (𝝃ζi=ζ^ii for which zi=1)\mathbb{P}(\bm{\xi}\mid\zeta_{i}=\hat{\zeta}_{i}\;\forall\,i\text{ for which }z_{i}=1), where 𝜻^\bm{\hat{\zeta}} denotes a specific observed realization of 𝜻\bm{\zeta}. In a Bayesian sense, (𝝃)\mathbb{P}(\bm{\xi}) and (𝝃|𝜻)\mathbb{P}(\bm{\xi}|\bm{\zeta}) can be interpreted as the prior and posterior distributions, respectively. Clearly, in this setting, 𝜻\bm{\zeta} are type-2b endogenous uncertain parameters. They may also be type-2a endogenous if their materialization is also decision-dependent. Note that 𝝃\bm{\xi} are considered exogenous uncertain parameters if their materialization and observation are not decision-dependent and if (𝝃)\mathbb{P}(\bm{\xi}) and (𝝃|𝜻)\mathbb{P}(\bm{\xi}|\bm{\zeta}) are fixed.

In a robust optimization setting, uncertainty reduction implies reducing the uncertainty set. Using the same notation as above, uncertainty reduction through active learning can then be characterized as follows: Let Ω\Omega be the uncertainty set for the uncertain parameters (𝝃,𝜻)(\bm{\xi},\bm{\zeta}) before the observation of any 𝜻\bm{\zeta}, i.e. (𝝃,𝜻)Ω(\bm{\xi},\bm{\zeta})\in\Omega. Once some 𝜻\bm{\zeta} are observed (indicated by the binary variables 𝒛\bm{z}) with the realization 𝜻^\bm{\hat{\zeta}}, the uncertainty set reduces to Ω¯(𝒛,𝜻^)={(𝝃,𝜻):(𝝃,𝜻)Ξ,𝜻=𝒛𝜻^}\bar{\Omega}(\bm{z},\bm{\hat{\zeta}})=\{(\bm{\xi},\bm{\zeta}):(\bm{\xi},\bm{\zeta})\in\Xi,\;\bm{\zeta}=\bm{z}\circ\bm{\hat{\zeta}}\}. Note that Ω¯(𝒛,𝜻^)Ω\bar{\Omega}(\bm{z},\bm{\hat{\zeta}})\subseteq\Omega for every possible 𝜻^\bm{\hat{\zeta}} as specified by Ω\Omega.

Example 1.

We consider investing in a large-scale industrial manufacturing plant, where the real plant capacity qq is unknown before it is actually built. To reduce the financial risk, we have the option of building a small-scale pilot plant whose achieved capacity pp can help better predict the capacity of the industrial-scale plant. The uncertainty set depicted in Figure 2 shows that pp and qq are highly correlated. If no pilot plant is built, the uncertain parameter pp does not materialize. As a result, we can only rely on our initial belief that the marginal uncertainty set for qq is 𝒬=[qmin,qmax]\mathcal{Q}=[q^{\min},q^{\max}]. However, if we do build a pilot plant and observe a pilot plant capacity of p¯[pmin,pmax]\bar{p}\in[p^{\min},p^{\max}], the uncertainty set for qq reduces to a significantly smaller 𝒬¯(p¯)={q:α+βp¯Δqα+βp¯+Δ}\bar{\mathcal{Q}}(\bar{p})=\{q:\alpha+\beta\bar{p}-\Delta\leq q\leq\alpha+\beta\bar{p}+\Delta\}. Note that in this example, pp is a type-2a and type-2b uncertain parameter since it only materializes (and can then be observed) if we decide to build the pilot plant.

Refer to caption
Figure 2: The uncertainty set for the industrial-scale plant capacity is reduced by observing the capacity of a pilot plant.

In Example 1, we use the term marginal uncertainty set and define it as the projection of the full-dimensional uncertainty set onto the space of a subset of variables, in this case qq. Generally, given uncertain parameters 𝝃\bm{\xi} and 𝜻\bm{\zeta}, observing 𝜻\bm{\zeta} further restricts the marginal uncertainty set for 𝝃\bm{\xi}, denoted by Ξ\Xi, only if the uncertain parameters are correlated. The geometric interpretation is depicted in Figure 3, where an observation 𝜻^\bm{\hat{\zeta}} results in cutting planes (indicated by the red dashed lines) added to Ξ\Xi such that the updated marginal uncertainty set for 𝝃\bm{\xi} becomes Ξ¯(𝜻^)\overline{\Xi}(\bm{\hat{\zeta}}).

Refer to caption
Figure 3: The marginal uncertainty set for 𝝃\bm{\xi} is reduced from Ξ\Xi (shaded rectangle) to Ξ¯(𝜻^)\overline{\Xi}(\bm{\hat{\zeta}}) (pattern-filled polytope) by added cutting planes corresponding to observation 𝜻^\bm{\hat{\zeta}}.

Modeling active learning as a robust optimization problem with endogenous uncertainty provides an opportunity for integrated optimization and learning, which are two tasks typically performed separately. Central to this problem is the trade-off between exploration and exploitation since our actions affect both the amount of new information and the reward that we receive. The proposed framework can readily incorporate this trade-off and is particularly well suited for problems with complex constraints and recourse decisions, which often cause difficulties for many alternative methods, such as Bayesian optimization 52, reinforcement learning 53, and multi-armed bandit optimization 54.

4 Two-Stage Robust Optimization with Endogenous Uncertainty

In the following, we show how type-1, type-2a, and type-2b endogenous uncertainty can be incorporated in a two-stage robust optimization formulation, eventually leading to a unified model that can capture all these types of endogenous uncertainty simultaneously.

4.1 Incorporating Type-1 Endogenous Uncertainty

Consider the following two-stage robust optimization problem:

minimize𝒙,𝒙~\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{\tilde{x}}} x1\displaystyle x_{1} (3)
subjectto\displaystyle\mathrm{subject\;to} 𝒙𝒳\displaystyle\bm{x}\in\mathcal{X}
𝑨(𝝃)𝒙+𝑨~(𝝃)𝒙~(𝝃)𝒃(𝝃)𝒙~(𝝃)𝒳~}𝝃Ξ(𝒙),\displaystyle\left.\begin{array}[]{l}\bm{A}(\bm{\xi})\bm{x}+\bm{\widetilde{A}}(\bm{\xi})\bm{\tilde{x}}(\bm{\xi})\leq\bm{b}(\bm{\xi})\\[4.0pt] \bm{\tilde{x}}(\bm{\xi})\in\widetilde{\mathcal{X}}\end{array}\right\}\quad\forall\,\bm{\xi}\in\Xi(\bm{x}),

where the dependence of the uncertainty set Ξ\Xi on 𝒙\bm{x} indicates type-1 endogenous uncertainty. Throughout this work, we restrict our discussion to decision-dependent uncertainty sets of the following polyhedral form:

Ξ(𝒙)={𝝃K:𝑾𝝃𝑼𝒙},\Xi(\bm{x})=\left\{\bm{\xi}\in\mathbb{R}^{K}:\bm{W}\bm{\xi}\leq\bm{U}\bm{x}\right\}, (4)

where we assume that Ξ(𝒙)\Xi(\bm{x}) is always nonempty and bounded, i.e. Ξ(𝒙){𝝃K:𝝃min𝝃𝝃max}\Xi(\bm{x})\subseteq\{\bm{\xi}\in\mathbb{R}^{K}:\bm{\xi}^{\min}\leq\bm{\xi}\leq\bm{\xi}^{\max}\} for any 𝒙𝒳\bm{x}\in\mathcal{X}.

  • Remark

    Some endogenous uncertainty can be modeled as exogenous uncertainty through a simple transformation. For example, consider a scalar endogenous uncertain parameter ξ\xi with the uncertainty set Ξ(x)={ξ:0ξx}\Xi(x)=\{\xi\in\mathbb{R}:0\leq\xi\leq x\}, which depends on variable xx. In this case, we could define a new uncertain parameter ξ~\tilde{\xi} and add the equation ξ=ξ~x\xi=\tilde{\xi}x to the model. The uncertainty set for ξ~\tilde{\xi} is then Ξ~={ξ~:0ξ~1}\widetilde{\Xi}=\{\tilde{\xi}\in\mathbb{R}:0\leq\tilde{\xi}\leq 1\}, which does not depend on xx, and ξ\xi can now be treated as a variable that is adjustable with respect to ξ~\tilde{\xi}.

4.2 Incorporating Type-2a Endogenous Uncertainty

To model type-2a endogenous uncertainty, we introduce binary variables 𝒚{0,1}K\bm{y}\in\{0,1\}^{K} to encode the materialization of uncertain parameters, i.e. yi=1y_{i}=1 if and only if ξi\xi_{i} materializes. Then, both type-1 endogenous uncertainty with decision-dependent uncertainty set of the form (4) and type-2a endogenous uncertainty can be captured in a two-stage formulation of the following form:

minimize𝒙,𝒚,𝒙~\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{y},\,\bm{\tilde{x}}} x1\displaystyle x_{1} (5)
subjectto\displaystyle\mathrm{subject\;to} 𝒙𝒳,𝒚𝒴\displaystyle\bm{x}\in\mathcal{X},\;\bm{y}\in\mathcal{Y}
𝑨(𝝃)𝒙+𝑫(𝝃)𝒚+𝑨~(𝝃)𝒙~(𝒚𝝃)𝒃(𝝃)𝒙~(𝒚𝝃)𝒳~}𝝃Ξ(𝒙,𝒚),\displaystyle\left.\begin{array}[]{l}\bm{A}(\bm{\xi})\bm{x}+\bm{D}(\bm{\xi})\bm{y}+\bm{\widetilde{A}}(\bm{\xi})\bm{\tilde{x}}(\bm{y}\circ\bm{\xi})\leq\bm{b}(\bm{\xi})\\[4.0pt] \bm{\tilde{x}}(\bm{y}\circ\bm{\xi})\in\widetilde{\mathcal{X}}\end{array}\right\}\quad\forall\,\bm{\xi}\in\Xi(\bm{x},\bm{y}),

where 𝒴{0,1}K\mathcal{Y}\subseteq\{0,1\}^{K} and

Ξ(𝒙,𝒚)={𝝃K:𝑾𝝃𝑼𝒙+𝑼¯𝒚},\Xi(\bm{x},\bm{y})=\left\{\bm{\xi}\in\mathbb{R}^{K}:\bm{W}\bm{\xi}\leq\bm{U}\bm{x}+\bm{\overline{U}}\bm{y}\right\}, (6)

assuming that Ξ(𝒙,𝒚){𝝃K:𝝃min𝝃𝝃max}\Xi(\bm{x},\bm{y})\subseteq\{\bm{\xi}\in\mathbb{R}^{K}:\bm{\xi}^{\min}\leq\bm{\xi}\leq\bm{\xi}^{\max}\} for any 𝒙𝒳\bm{x}\in\mathcal{X} and 𝒚𝒴\bm{y}\in\mathcal{Y}. Note that formulation (5) is actually of the same form as (3); we have merely augmented the first-stage variables with 𝒚\bm{y} and restricted the recourse functions to only depend on materialized parameters, i.e. 𝒚𝝃\bm{y}\circ\bm{\xi}. Here, the assumption is that all materialized uncertain parameters are also observed.

While the recourse dependence property of type-2a endogenous uncertainty is explicitly stated, it may not be immediately obvious how the change of dimensionality of the uncertainty set and the DFS dependence property can be captured in (5). Here, the uncertainty set Ξ(𝒙,𝒚)\Xi(\bm{x},\bm{y}) has to be constructed such that for a given 𝒚\bm{y}, its projection onto the space of materialized uncertain parameters, i.e. ξi\xi_{i} for which yi=1y_{i}=1, forms the desired uncertainty set for the given 𝒚\bm{y}. In addition, the constraints in (5) have to be constructed such that the feasible region is not affected by any ξiΞi(𝒙,𝒚)\xi_{i}\in\Xi_{i}(\bm{x},\bm{y}) for which yi=0y_{i}=0, where Ξi(𝒙,𝒚)\Xi_{i}(\bm{x},\bm{y}) denotes the projection of Ξ(𝒙,𝒚)\Xi(\bm{x},\bm{y}) onto the ξi\xi_{i}-space. Uncertainty sets and constraints satisfying these requirements can be constructed in various ways. To show the generality of formulation (5), we provide the following result.

Proposition 1.

Formulation (5) is general in a sense that it can model the general case in which every possible set of materialized uncertain parameters results in a distinct set of constraints and a distinct 𝐱\bm{x}-dependent uncertainty set of appropriate dimensionality.

Proof.

We provide a constructive proof for Proposition 1. In the following, we refer to a possible set of materialized uncertain parameters as a materialization scenario. Since 𝒚{0,1}K\bm{y}\in\{0,1\}^{K} is defined such that each materialization scenario is given by a specific 𝒚\bm{y} with yi=1y_{i}=1 if and only if the uncertain parameter ξi\xi_{i} materializes, the set of materialization scenarios is finite (maximum 2K2^{K}). Let SS be the number of materialization scenarios and 𝒮:={1,,S}\mathcal{S}:=\{1,\dots,S\}. We define a set 𝒴:={𝒚¯1,,𝒚¯S}\mathcal{Y}:=\{\bm{\bar{y}}^{1},\dots,\bm{\bar{y}}^{S}\} where 𝒚¯s\bm{\bar{y}}^{s} is the specific 𝒚\bm{y}-vector that represents scenario ss. In the most general case, each materialization scenario is associated with a distinct set of constraints and a distinct uncertainty set involving only the materialized uncertain parameters. Hence, the two-stage problem can be formulated as follows:

minimize𝒙,𝒚,𝒙~\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{y},\,\bm{\tilde{x}}} x1\displaystyle x_{1} (7)
subjectto\displaystyle\mathrm{subject\;to} 𝒙𝒳,𝒚𝒴\displaystyle\bm{x}\in\mathcal{X},\;\bm{y}\in\mathcal{Y}
s𝒮[𝒚=𝒚¯s𝑨s(𝝃s)𝒙+𝑫s(𝝃s)𝒚¯s+𝑨~s(𝝃s)𝒙~(𝝃s)𝒃s(𝝃s)𝒙~(𝝃s)𝒳~}𝝃sΞs(𝒙)],\displaystyle\bigvee_{s\in\mathcal{S}}\>\left[\begin{array}[]{c}\bm{y}=\bm{\bar{y}}^{s}\\[4.0pt] \left.\begin{array}[]{c}\bm{A}^{s}(\bm{\xi}^{s})\bm{x}+\bm{D}^{s}(\bm{\xi}^{s})\bm{\bar{y}}^{s}+\bm{\widetilde{A}}^{s}(\bm{\xi}^{s})\bm{\tilde{x}}(\bm{\xi}^{s})\leq\bm{b}^{s}(\bm{\xi}^{s})\\[4.0pt] \bm{\tilde{x}}(\bm{\xi}^{s})\in\widetilde{\mathcal{X}}\end{array}\right\}\;\forall\,\bm{\xi}^{s}\in\Xi^{s}(\bm{x})\end{array}\right],

where the disjunction indicates that only the constraints for the selected materialization scenario need to be satisfied. Here, 𝝃s\bm{\xi}^{s} only contains the ξi\xi_{i} for which y¯is=1\bar{y}^{s}_{i}=1 and is therefore a KsK^{s}-dimensional vector with Ks=𝒆𝒚¯sK^{s}=\bm{e}^{\top}\bm{\bar{y}}^{s}. The KsK^{s}-dimensional scenario-specific uncertainty set is

Ξs(𝒙)={𝝃sKs:𝑾s𝝃s𝑼s𝒙}{𝝃sKs:ξiminξiξimaxi for which y¯is=1}.\Xi^{s}(\bm{x})=\left\{\bm{\xi}^{s}\in\mathbb{R}^{K^{s}}:\bm{W}^{s}\bm{\xi}^{s}\leq\bm{U}^{s}\bm{x}\right\}\subseteq\left\{\bm{\xi}^{s}\in\mathbb{R}^{K^{s}}:\xi^{\min}_{i}\leq\xi_{i}\leq\xi^{\max}_{i}\;\;\forall\,i\text{ for which }\bar{y}^{s}_{i}=1\right\}.

Similar to the constraints, a disjunction can also be used to define an overall uncertainty set Ξ(𝒙,𝒚)\Xi(\bm{x},\bm{y}). By doing so, problem (7) can be reformulated as

minimize𝒙,𝒚,𝒙~\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{y},\,\bm{\tilde{x}}} x1\displaystyle x_{1} (8)
subjectto\displaystyle\mathrm{subject\;to} 𝒙𝒳,𝒚𝒴\displaystyle\bm{x}\in\mathcal{X},\;\bm{y}\in\mathcal{Y}
s𝒮[𝒚=𝒚¯s𝑨s(𝝃s)𝒙+𝑫s(𝝃s)𝒚¯s+𝑨~s(𝝃s)𝒙~(𝒚¯s𝝃)𝒃s(𝝃s)𝒙~(𝒚¯s𝝃)𝒳~]𝝃Ξ(𝒙,𝒚)\displaystyle\bigvee_{s\in\mathcal{S}}\>\left[\begin{array}[]{c}\bm{y}=\bm{\bar{y}}^{s}\\[4.0pt] \bm{A}^{s}(\bm{\xi}^{s})\bm{x}+\bm{D}^{s}(\bm{\xi}^{s})\bm{\bar{y}}^{s}+\bm{\widetilde{A}}^{s}(\bm{\xi}^{s})\bm{\tilde{x}}(\bm{\bar{y}}^{s}\circ\bm{\xi})\leq\bm{b}^{s}(\bm{\xi}^{s})\\[4.0pt] \bm{\tilde{x}}(\bm{\bar{y}}^{s}\circ\bm{\xi})\in\widetilde{\mathcal{X}}\end{array}\right]\quad\forall\,\bm{\xi}\in\Xi(\bm{x},\bm{y})

with

Ξ(𝒙,𝒚)={𝝃K:s𝒮[𝒚=𝒚¯s𝑾s𝝃s𝑼s𝒙ξi=0i for which y¯is=0]}.\Xi(\bm{x},\bm{y})=\left\{\bm{\xi}\in\mathbb{R}^{K}:\bigvee_{s\in\mathcal{S}}\>\left[\begin{array}[]{c}\bm{y}=\bm{\bar{y}}^{s}\\[4.0pt] \bm{W}^{s}\bm{\xi}^{s}\leq\bm{U}^{s}\bm{x}\\[4.0pt] \xi_{i}=0\;\;\forall\,i\text{ for which }\bar{y}^{s}_{i}=0\end{array}\right]\right\}.

Note that in Ξ(𝒙,𝒚)\Xi(\bm{x},\bm{y}), we set unmaterialized uncertain parameters to zero; this is a convenient but arbitrary choice since for a given materialization scenario ss, only the projection of Ξ(𝒙,𝒚)\Xi(\bm{x},\bm{y}) onto the 𝝃s\bm{\xi}^{s}-space matters.

With additional binary variables 𝝎{0,1}S\bm{\omega}\in\{0,1\}^{S}, we can reformulate the disjunctions and arrive at the following formulation:

minimize𝒙,𝝎,𝒚,𝒙~\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{\omega},\,\bm{y},\,\bm{\tilde{x}}} x1\displaystyle x_{1} (9)
subjectto\displaystyle\mathrm{subject\;to} 𝒙𝒳,𝝎{0,1}S,𝒚𝒴\displaystyle\bm{x}\in\mathcal{X},\;\bm{\omega}\in\{0,1\}^{S},\;\bm{y}\in\mathcal{Y}
s𝒮ωs=1,ωs=1𝒚=𝒚¯ss𝒮\displaystyle\sum_{s\in\mathcal{S}}\omega^{s}=1,\quad\omega^{s}=1\Leftrightarrow\bm{y}=\bm{\bar{y}}^{s}\quad\forall\,s\in\mathcal{S}
𝑨s(𝝃s)𝒙+𝑫s(𝝃s)𝒚+𝑨~s(𝝃s)𝒙~(𝒚𝝃)𝒃s(𝝃s)+Ms𝒆(1ωs)s𝒮,𝝃Ξ(𝒙,𝝎,𝒚)\displaystyle\bm{A}^{s}(\bm{\xi}^{s})\bm{x}+\bm{D}^{s}(\bm{\xi}^{s})\bm{y}+\bm{\widetilde{A}}^{s}(\bm{\xi}^{s})\bm{\tilde{x}}(\bm{y}\circ\bm{\xi})\leq\bm{b}^{s}(\bm{\xi}^{s})+M^{s}\bm{e}(1-\omega^{s})\quad\forall\,s\in\mathcal{S},\,\bm{\xi}\in\Xi(\bm{x},\bm{\omega},\bm{y})
𝒙~(𝒚𝝃)𝒳~𝝃Ξ(𝒙,𝝎,𝒚)\displaystyle\bm{\tilde{x}}(\bm{y}\circ\bm{\xi})\in\widetilde{\mathcal{X}}\quad\forall\,\bm{\xi}\in\Xi(\bm{x},\bm{\omega},\bm{y})

with

Ξ(𝒙,𝝎,𝒚)={𝝃K:𝑾s𝝃s𝑼s𝒙+M^s𝒆(1ωs)s𝒮𝝃min𝒚𝝃𝝃max𝒚},\Xi(\bm{x},\bm{\omega},\bm{y})=\left\{\bm{\xi}\in\mathbb{R}^{K}:\begin{array}[]{l}\bm{W}^{s}\bm{\xi}^{s}\leq\bm{U}^{s}\bm{x}+\widehat{M}^{s}\bm{e}(1-\omega^{s})\quad\forall\,s\in\mathcal{S}\\[4.0pt] \bm{\xi}^{\min}\circ\bm{y}\leq\bm{\xi}\leq\bm{\xi}^{\max}\circ\bm{y}\end{array}\right\},

where MsM^{s} and M^s\widehat{M}^{s} are vectors of sufficiently large parameters. As the statement ωs=1𝒚=𝒚¯s\omega^{s}=1\Leftrightarrow\bm{y}=\bm{\bar{y}}^{s} can be expressed as as set of linear constraints, e.g. 𝒚¯sωs𝒚𝒆+(𝒚¯s𝒆)ωs\bm{\bar{y}}^{s}\omega^{s}\leq\bm{y}\leq\bm{e}+(\bm{\bar{y}}^{s}-\bm{e})\omega^{s}, formulation (9) can be easily transformed into the form of (5) by appropriate redefinition of variables, constraint matrices, and right-hand sides. ∎

4.3 Incorporating Type-2b Endogenous Uncertainty

Finally, we introduce additional binary variables 𝒛{0,1}K\bm{z}\in\{0,1\}^{K} to indicate which uncertain parameters are observed, and we arrive at the following unified two-stage robust optimization formulation that considers type-1, type-2a, and type-2b endogenous uncertainty:

minimize𝒙,𝒚,𝒛,𝒙~\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{y},\,\bm{z},\,\bm{\tilde{x}}} x1\displaystyle x_{1} (10)
subjectto\displaystyle\mathrm{subject\;to} 𝒙𝒳,𝒚𝒴,𝒛𝒵,𝒛𝒚\displaystyle\bm{x}\in\mathcal{X},\;\bm{y}\in\mathcal{Y},\;\bm{z}\in\mathcal{Z},\;\bm{z}\leq\bm{y}
𝑨(𝝃)𝒙+𝑫(𝝃)𝒚+𝑯(𝝃)𝒛+𝑨~(𝝃)𝒙~(𝒛𝝃)𝒃(𝝃)𝒙~(𝒛𝝃)𝒳~}𝝃Ξ(𝒙,𝒚),\displaystyle\left.\begin{array}[]{l}\bm{A}(\bm{\xi})\bm{x}+\bm{D}(\bm{\xi})\bm{y}+\bm{H}(\bm{\xi})\bm{z}+\bm{\widetilde{A}}(\bm{\xi})\bm{\tilde{x}}(\bm{z}\circ\bm{\xi})\leq\bm{b}(\bm{\xi})\\[4.0pt] \bm{\tilde{x}}(\bm{z}\circ\bm{\xi})\in\widetilde{\mathcal{X}}\end{array}\right\}\quad\forall\,\bm{\xi}\in\Xi(\bm{x},\bm{y}),

where 𝒵{0,1}K\mathcal{Z}\subseteq\{0,1\}^{K}. The recourse decisions 𝒙~(𝒛𝝃)\bm{\tilde{x}}(\bm{z}\circ\bm{\xi}) can only be functions of the observed parameters. Also, by requiring 𝒛𝒚\bm{z}\leq\bm{y}, parameters can only be observed if they materialize. The formulation of the uncertainty set does not change from (6):

Ξ(𝒙,𝒚)={𝝃K:𝑾𝝃𝑼𝒙+𝑼¯𝒚}.\Xi(\bm{x},\bm{y})=\left\{\bm{\xi}\in\mathbb{R}^{K}:\bm{W}\bm{\xi}\leq\bm{U}\bm{x}+\bm{\overline{U}}\bm{y}\right\}.

5 Multistage Robust Optimization with Endogenous Uncertainty

In multistage robust optimization with endogenous uncertainty, decisions that affect the uncertainty can also be recourse variables. Moreover, while the two-stage problem can only consider decisions determining if uncertain parameters materialize or are observed, multistage formulations can incorporate decisions affecting when, i.e. in which stage, uncertain parameters materialize or are observed.

Let 𝒯={1,,T}\mathcal{T}=\{1,\dots,T\} denote the set of stages and \mathcal{I} the set of uncertain parameters. An uncertain parameter ii\in\mathcal{I} can materialize anytime between (and including) given stages τi\tau_{i} and τ¯i\bar{\tau}_{i}. We define a vector of uncertain parameters 𝝃tKt\bm{\xi}_{t}\in\mathbb{R}^{K_{t}} that contains all uncertain parameters ii for which τi=t\tau_{i}=t. We further define all uncertain parameters that can materialize in or prior to stage tt as 𝝃[t]=(𝝃1,,𝝃t)\bm{\xi}_{[t]}=(\bm{\xi}_{1},\dots,\bm{\xi}_{t}), and 𝝃=𝝃[T]\bm{\xi}=\bm{\xi}_{[T]}. The set of uncertain parameters for which τi=t\tau_{i}=t is denoted by ~t\widetilde{\mathcal{I}}_{t}, and we have ~t~t=\widetilde{\mathcal{I}}_{t}\cap\widetilde{\mathcal{I}}_{t^{\prime}}=\emptyset for all ttt\neq t^{\prime} and =t𝒯~t\mathcal{I}=\bigcup_{t\in\mathcal{T}}\widetilde{\mathcal{I}}_{t}. The set of uncertain parameters that can materialize in stage tt is denoted by ¯t\bar{\mathcal{I}}_{t}. In the special case where every uncertain parameter can only materialize in one stage, ¯t=~t\bar{\mathcal{I}}_{t}=\widetilde{\mathcal{I}}_{t}. Similarly, let ^t\widehat{\mathcal{I}}_{t} denote the set of uncertain parameters that can be observed in stage tt, and τ^i\hat{\tau}_{i} and τ~i\tilde{\tau}_{i} be the stages between which uncertain parameter ii can be observed. Since the observation of an uncertain parameter can only occur after its materialization, we have τ^iτi\hat{\tau}_{i}\geq\tau_{i} and τ~iτ¯i\tilde{\tau}_{i}\geq\bar{\tau}_{i}.

The general multistage robust optimization problem with type-1, type-2a, and type-2b endogenous uncertainty can be formulated as follows:

minimize𝒙,𝒚,𝒛\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{y},\,\bm{z}} x11\displaystyle x_{11} (11)
subjectto\displaystyle\begin{array}[]{l}\mathrm{subject\;to}\;\vphantom{\sum\limits_{t^{\prime}=1}^{t}}\\[10.0pt] \vphantom{\bm{x}_{t}(\bm{\xi}_{[t]})\in\mathcal{X}_{t}}\\[4.0pt] \vphantom{\bm{y}_{t}(\bm{\xi}_{[t]})\in\mathcal{Y}_{t}}\\[4.0pt] \vphantom{\bm{z}_{t}(\bm{\xi}_{[t]})\in\mathcal{Z}_{t}}\end{array} t=1t[𝑨tt(𝝃[t])𝒙t+𝑫tt(𝝃[t])𝒚t+𝑯tt(𝝃[t])𝒛t]𝒃t(𝝃[t])𝒙t=𝒙t(𝒛¯t1𝝃[t]),𝒙t𝒳t𝒚t=𝒚t(𝒛¯t1𝝃[t]),𝒚t𝒴t𝒛t=𝒛t(𝒛¯t1𝝃[t]),𝒛t𝒵t}\displaystyle\left.\begin{array}[]{l}\sum\limits_{t^{\prime}=1}^{t}\left[\bm{A}_{tt^{\prime}}(\bm{\xi}_{[t]})\bm{x}_{t^{\prime}}+\bm{D}_{tt^{\prime}}(\bm{\xi}_{[t]})\bm{y}_{t^{\prime}}+\bm{H}_{tt^{\prime}}(\bm{\xi}_{[t]})\bm{z}_{t^{\prime}}\right]\leq\bm{b}_{t}(\bm{\xi}_{[t]})\\[10.0pt] \bm{x}_{t}=\bm{x}_{t}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t]}\right),\;\bm{x}_{t}\in\mathcal{X}_{t}\\[4.0pt] \bm{y}_{t}=\bm{y}_{t}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t]}\right),\;\bm{y}_{t}\in\mathcal{Y}_{t}\\[4.0pt] \bm{z}_{t}=\bm{z}_{t}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t]}\right),\;\bm{z}_{t}\in\mathcal{Z}_{t}\end{array}\right\}
t𝒯,𝝃[t]Ξt(𝒙[t1],𝒚[t1],𝒛[t2])\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\forall\,t\in\mathcal{T},\,\bm{\xi}_{[t]}\in\Xi_{t}(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-2]})

where 𝒴t{0,1}|¯t+1|\mathcal{Y}_{t}\in\{0,1\}^{|\bar{\mathcal{I}}_{t+1}|} and 𝒵t{0,1}|^t+1|\mathcal{Z}_{t}\in\{0,1\}^{|\widehat{\mathcal{I}}_{t+1}|}. The notation is such that 𝒙[t]=(𝒙1,,𝒙t)\bm{x}_{[t]}=(\bm{x}_{1},\dots,\bm{x}_{t}), 𝒙=𝒙[T]\bm{x}=\bm{x}_{[T]}, 𝒚[t]=(𝒚1,,𝒚t)\bm{y}_{[t]}=(\bm{y}_{1},\dots,\bm{y}_{t}), 𝒚=𝒚[T]\bm{y}=\bm{y}_{[T]}, 𝒛[t]=(𝒛1,,𝒛t)\bm{z}_{[t]}=(\bm{z}_{1},\dots,\bm{z}_{t}), and 𝒛=𝒛[T]\bm{z}=\bm{z}_{[T]}. The binary variable ytiy_{ti} equals 1 if uncertain parameter ii materializes in stage t+1t+1, whereas the binary variable ztiz_{ti} equals 1 if uncertain parameter ii is observed in stage t+1t+1. We use auxiliary variables 𝒛¯t1=(t=0t1zt1,,t=0t1ztK[t])\bm{\bar{z}}_{t-1}=\left(\sum_{t^{\prime}=0}^{t-1}z_{t^{\prime}1},\dots,\sum_{t^{\prime}=0}^{t-1}z_{t^{\prime}K_{[t]}}\right) with K[t]=t=1tKtK_{[t]}=\sum_{t^{\prime}=1}^{t}K_{t^{\prime}} to indicate which uncertain parameters in 𝝃[t]\bm{\xi}_{[t]} are observed up to stage tt. The recourse variables in stage tt, i.e. 𝒙t\bm{x}_{t}, 𝒚t\bm{y}_{t}, and 𝒛t\bm{z}_{t}, are functions of 𝒛¯t1𝝃[t]\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t]}, which represent the uncertain parameters observed up to stage tt. For notational convenience, we assume that 𝝃1=ξ11=1\bm{\xi}_{1}=\xi_{11}=1, 𝒚0=y01=1\bm{y}_{0}=y_{01}=1, and 𝒛0=z01=1\bm{z}_{0}=z_{01}=1. The stage-specific uncertainty set Ξt\Xi_{t} is defined such that Ξ1={𝝃1:ξ11=1}\Xi_{1}=\{\bm{\xi}_{1}\in\mathbb{R}:\xi_{11}=1\}, and for t2t\geq 2:

Ξt(𝒙[t1],𝒚[t1],𝒛[t2])={𝝃[t]K[t]:𝑾t𝝃[t]t=1t1[𝑼tt𝒙t+𝑼¯tt𝒚t]𝒙t=𝒙t(𝒛¯t1𝝃[t])t=1,,t1𝒚t=𝒚t(𝒛¯t1𝝃[t])t=1,,t1𝒛t=𝒛t(𝒛¯t1𝝃[t])t=1,,t1},\Xi_{t}(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-2]})=\left\{\bm{\xi}_{[t]}\in\mathbb{R}^{K_{[t]}}:\begin{array}[]{l}\bm{W}_{t}\bm{\xi}_{[t]}\leq\sum\limits_{t^{\prime}=1}^{t-1}\left[\bm{U}_{tt^{\prime}}\bm{x}_{t^{\prime}}+\bm{\overline{U}}_{tt^{\prime}}\bm{y}_{t^{\prime}}\right]\\[8.0pt] \bm{x}_{t^{\prime}}=\bm{x}_{t^{\prime}}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t^{\prime}]}\right)\quad\forall\,t^{\prime}=1,\dots,t-1\\[4.0pt] \bm{y}_{t^{\prime}}=\bm{y}_{t^{\prime}}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t^{\prime}]}\right)\quad\forall\,t^{\prime}=1,\dots,t-1\\[4.0pt] \bm{z}_{t^{\prime}}=\bm{z}_{t^{\prime}}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t^{\prime}]}\right)\quad\forall\,t^{\prime}=1,\dots,t-1\end{array}\right\}, (12)

where the polyhedral set directly depends on 𝒙[t1]\bm{x}_{[t-1]} and 𝒚[t1]\bm{y}_{[t-1]}, which in turn are functions of observed uncertain parameters given by 𝒛[t2]\bm{z}_{[t-2]}. We assume that Ξt(𝒙[t1],𝒚[t1],𝒛[t2]){𝝃[t]K[t]:𝝃[t]min𝝃[t]𝝃[t]max}\Xi_{t}(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-2]})\subseteq\{\bm{\xi}_{[t]}\in\mathbb{R}^{K_{[t]}}:\bm{\xi}^{\min}_{[t]}\leq\bm{\xi}_{[t]}\leq\bm{\xi}^{\max}_{[t]}\} for all feasible (𝒙[t1],𝒚[t1],𝒛[t2])(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-2]}).

The linear constraints in problem (11) include the following inequalities:

t=τiτ¯iyt1,i1i\displaystyle\sum_{t^{\prime}=\tau_{i}}^{\bar{\tau}_{i}}y_{t^{\prime}-1,i}\leq 1\quad\forall\,i\in\mathcal{I} (13a)
t=τ^iτ~izt1,i1i\displaystyle\sum_{t^{\prime}=\hat{\tau}_{i}}^{\tilde{\tau}_{i}}z_{t^{\prime}-1,i}\leq 1\quad\forall\,i\in\mathcal{I} (13b)
t=τimin{t,τ¯i}yt1,izt1,it𝒯{1},i^t.\displaystyle\sum_{t^{\prime}=\tau_{i}}^{\min\{t,\bar{\tau}_{i}\}}y_{t^{\prime}-1,i}\geq z_{t-1,i}\quad\forall\,t\in\mathcal{T}\setminus\{1\},\,i\in\widehat{\mathcal{I}}_{t}. (13c)

Constraints (13a) state that each uncertain parameter ii can only materialize between stages τi\tau_{i} and τ¯i\bar{\tau}_{i}; analogously, uncertain parameter ii can only be observed between stages τ^i\hat{\tau}_{i} and τ~i\tilde{\tau}_{i} according to (13b). Inequalities (13c) ensure that an uncertain parameter can only be observed after it has materialized.

6 Decision Rule Approach

In this section, we present a decision rule approach to (approximately) solve the general multistage robust optimization problem under endogenous uncertainty, which reduces to the two-stage case when T=2T=2. The proposed approach is an extension of the method presented in Feng et al. 48 and relies on the concept of lifted uncertainty 55. Here, the main innovation is the use of auxiliary uncertain parameters to model decision-dependent nonanticipativity.

For ease of exposition and for tractability reasons, we consider the case of fixed recourse, i.e. 𝑨tt\bm{A}_{tt^{\prime}}, 𝑫tt\bm{D}_{tt^{\prime}}, and 𝑯tt\bm{H}_{tt^{\prime}} do not depend on 𝝃[t]\bm{\xi}_{[t]}. In theory, the presented approach can be adapted to consider random recourse 11, however, at considerable additional computational cost. We further assume that 𝒃t(𝝃[t])=𝑩t𝝃[t]\bm{b}_{t}(\bm{\xi}_{[t]})=\bm{B}_{t}\bm{\xi}_{[t]} and 𝒳t=N¯t×{0,1}N^t\mathcal{X}_{t}=\mathbb{R}^{\bar{N}_{t}}\times\{0,1\}^{\widehat{N}_{t}}. Note that 𝒙t\bm{x}_{t} contain both continuous and binary variables, and we define 𝒙t=(𝒙tc,𝒙tb)\bm{x}_{t}=(\bm{x}^{\mathrm{c}}_{t},\bm{x}^{\mathrm{b}}_{t}) such that 𝒙tcN¯t\bm{x}^{\mathrm{c}}_{t}\in\mathbb{R}^{\bar{N}_{t}} and 𝒙tb{0,1}N^t\bm{x}^{\mathrm{b}}_{t}\in\{0,1\}^{\widehat{N}_{t}}.

6.1 Decision-Dependent Nonanticipativity

In problem (11), recourse variables can only be adjusted based on the values of observed uncertain parameters; this is referred to as nonanticipativity. As a result, any decision rule applied to a recourse variable can only be a function of uncertain parameters that have been observed up to the corresponding stage. Suppose we apply decision rules with a linear structure such that, for example, 𝒙t=𝑿~t𝝃[t]\bm{x}_{t}=\bm{\widetilde{X}}_{t}\bm{\xi}_{[t]}. Then, intuitively, one may attempt to model nonanticipavity by incorporating additional constraints that force decision rule parameters corresponding to unobserved uncertain parameters to be zero, e.g. Mz¯t1,i𝒆𝒙~tiMz¯t1,i𝒆-M\bar{z}_{t-1,i}\bm{e}\leq\bm{\tilde{x}}_{ti}\leq M\bar{z}_{t-1,i}\bm{e} with MM being a sufficiently large parameter. In the two-stage case, where 𝒛\bm{z} are only first-stage variables, this is a viable approach. However, in a multistage setting, 𝒛[t1]\bm{z}_{[t-1]} can themselves be adjustable recourse variables; in that case, the added constraints also require 𝑿~t\bm{\widetilde{X}}_{t} to be adjustable. Simply applying another decision rule to 𝑿~t\bm{\widetilde{X}}_{t} does not resolve the issue since the decision rule parameters for 𝑿~t\bm{\widetilde{X}}_{t} would also have to be adjustable because of their dependence on 𝒛[t1]\bm{z}_{[t-1]}. For fixed uncertainty sets, Vayanos et al. 38 propose an approach that divides the uncertainty set into a number of preselected subsets 𝒮\mathcal{S} and introduces recourse variables for each subset s𝒮s\in\mathcal{S}. As a result, constant 𝒛s\bm{z}_{s} can be selected for each subset ss, which allows restrictions to be imposed on the decision rule parameters for the remaining recourse variables such that nonanticipativity is satisfied. However, this approach is severely computationally intractable if the number of uncertainty subsets is large, which is only exacerbated in the case of problem (11) where the uncertainty set is generally not fixed.

We propose an alternative, more tractable approach to incorporate decision-dependent nonanticipativity. Notice that given a linear decision rule structure, the dependence of a recourse variable on an uncertain parameter is equally eliminated if, instead of setting the corresponding decision rule coefficients to zero, the uncertain parameter itself is set to zero. In the special case where an uncertain parameter can only materialize in one stage and its materialization directly leads to its observation (i.e. 𝒚=𝒛\bm{y}=\bm{z}), this can be easily incorporated by including the following inequalities in the definition of the uncertainty set Ξt\Xi_{t}:

𝝃tmin𝒚t1𝝃t𝝃tmax𝒚t1t=1,,t,\bm{\xi}^{\min}_{t^{\prime}}\circ\bm{y}_{t^{\prime}-1}\leq\bm{\xi}_{t^{\prime}}\leq\bm{\xi}^{\max}_{t^{\prime}}\circ\bm{y}_{t^{\prime}-1}\quad\forall\,t^{\prime}=1,\dots,t, (14)

which are valid since, as noted in the proof of Proposition 1, the values of unmaterialized parameters can be set to zero without loss of generality. However, inequalities (14) cannot capture the general case where there may be multiple stages in which an uncertain parameter can materialize or where despite being already materialized, one can still decide whether or not to observe the true value of the uncertain parameter in a given stage.

To address the general case, we first generalize (14) such that uncertain parameters that can materialize in multiple stages are considered:

ξτiimint=τimin{t,τ¯i}yt1,iξτiiξτiimaxt=τimin{t,τ¯i}yt1,iit=1t~t.\xi^{\min}_{\tau_{i}i}\sum_{t^{\prime}=\tau_{i}}^{\min\{t,\bar{\tau}_{i}\}}y_{t^{\prime}-1,i}\leq\xi_{\tau_{i}i}\leq\xi^{\max}_{\tau_{i}i}\sum_{t^{\prime}=\tau_{i}}^{\min\{t,\bar{\tau}_{i}\}}y_{t^{\prime}-1,i}\quad\forall\,i\in\bigcup_{t^{\prime}=1}^{t}\widetilde{\mathcal{I}}_{t^{\prime}}. (15)

We then introduce τ~iτ^i+1\tilde{\tau}_{i}-\hat{\tau}_{i}+1 auxiliary uncertain parameters ζτ^ii,,ζτ~ii\zeta_{\hat{\tau}_{i}i},\dots,\zeta_{\tilde{\tau}_{i}i} for every ii\in\mathcal{I}. Note that for an original uncertain parameter ii for which τi=τ~i\tau_{i}=\tilde{\tau}_{i} and yτi1,i=zτi1,iy_{\tau_{i}-1,i}=z_{\tau_{i}-1,i} (which corresponds to the above-mentioned special case), we do not actually require auxiliary uncertain parameters; however, for ease of notation, we assume that we have auxiliary uncertain parameters associated with every ii\in\mathcal{I}. We incorporate the following inequalities in the definition of the uncertainty set for stage tt:

ξτiiξτiimax(1zt1,i)ζtiξτii+M(1zt1,i)t=1,,t,i^t\displaystyle\xi_{\tau_{i}i}-\xi^{\max}_{\tau_{i}i}(1-z_{t^{\prime}-1,i})\leq\zeta_{t^{\prime}i}\leq\xi_{\tau_{i}i}+M(1-z_{t^{\prime}-1,i})\quad\forall\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}} (16a)
ξτiiminzt1,iζtiξτiimaxzt1,it=1,,t,i^t,\displaystyle\xi^{\min}_{\tau_{i}i}z_{t^{\prime}-1,i}\leq\zeta_{t^{\prime}i}\leq\xi^{\max}_{\tau_{i}i}z_{t^{\prime}-1,i}\quad\forall\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}}, (16b)

where MM is a sufficiently large big-M parameter. Inequalities (16) enforce that ζti\zeta_{t^{\prime}i} takes the value of ξτii\xi_{\tau_{i}i} if ξτii\xi_{\tau_{i}i} is observed in stage tt^{\prime}, and is zero otherwise.

With a slight abuse of notation, we define the following new uncertainty set that considers the original and the auxiliary uncertain parameters:

Θt(𝒙[t1],𝒚[t1],𝒛[t1])={(𝝃[t],𝜻[t])t=1tKt+|^t|:t=1t[𝑾tt𝝃t+𝑾¯tt𝜻t]\displaystyle\Theta_{t}(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-1]})=\left\{(\bm{\xi}_{[t]},\bm{\zeta}_{[t]})\in\mathbb{R}^{\sum_{t^{\prime}=1}^{t}K_{t^{\prime}}+|\widehat{\mathcal{I}}_{t^{\prime}}|}:\sum_{t^{\prime}=1}^{t}\left[\bm{W}_{tt^{\prime}}\bm{\xi}_{t^{\prime}}+\bm{\overline{W}}_{tt^{\prime}}\bm{\zeta}_{t^{\prime}}\right]\right.
t=1t1[𝑼ttc𝒙tc(𝜻[t])+𝑼ttb𝒙tb(𝜻[t])+𝑼¯tt𝒚t(𝜻[t])+𝑼^tt𝒛t(𝜻[t])]},\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\leq\left.\sum\limits_{t^{\prime}=1}^{t-1}\left[\bm{U}^{\mathrm{c}}_{tt^{\prime}}\bm{x}^{\mathrm{c}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{U}^{\mathrm{b}}_{tt^{\prime}}\bm{x}^{\mathrm{b}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\overline{U}}_{tt^{\prime}}\bm{y}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\widehat{U}}_{tt^{\prime}}\bm{z}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})\right]\right\},

where 𝑾ttQt×Kt\bm{W}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times K_{t^{\prime}}}, 𝑾¯ttQt×|^t|\bm{\overline{W}}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times|\widehat{\mathcal{I}}_{t^{\prime}}|}, 𝑼ttcQt×N¯t\bm{U}^{\mathrm{c}}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times\bar{N}_{t^{\prime}}}, 𝑼ttbQt×N^t\bm{U}^{\mathrm{b}}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times\widehat{N}_{t^{\prime}}}, 𝑼¯ttQt×|¯t+1|\bm{\overline{U}}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times|\bar{\mathcal{I}}_{t^{\prime}+1}|}, and 𝑼^ttQt×|^t+1|\bm{\widehat{U}}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times|\widehat{\mathcal{I}}_{t^{\prime}+1}|} are chosen to incorporate the inequalities from the original uncertainty set (12) as well as inequalities (15) and (16). Notice that the recourse variables 𝒙tc\bm{x}^{\mathrm{c}}_{t}, 𝒙tb\bm{x}^{\mathrm{b}}_{t}, 𝒚t\bm{y}_{t}, and 𝒛t\bm{z}_{t} now depend on 𝜻[t]\bm{\zeta}_{[t]}, which represent the uncertain parameters observed up to stage tt.

6.2 Lifted Uncertainty

For each auxiliary uncertain parameter ζti\zeta_{ti}, we introduce two sets of new uncertain parameters and hence create a higher-dimensional lifted uncertainty space, which will facilitate the design of flexible decision rules. To this end, the bounded marginal support of ζti\zeta_{ti}, which is the same as the one of ξτii\xi_{\tau_{i}i}, is partitioned into rir_{i} pieces defined by its lower and upper bounds as well as ri1r_{i}-1 breakpoints, and we denote these points by pijp_{i}^{j} with j=0,1,,rij=0,1,\dots,r_{i} such that

ξτiimin=pi0<pi1<<piri1<piri=ξτiimax.\xi_{\tau_{i}i}^{\min}=p_{i}^{0}<p_{i}^{1}<\cdots<p_{i}^{r_{i}-1}<p_{i}^{r_{i}}=\xi_{\tau_{i}i}^{\max}.

We then define two lifting operators, L¯t:|^t|K¯t\bar{L}_{t}:\mathbb{R}^{|\widehat{\mathcal{I}}_{t}|}\mapsto\mathbb{R}^{\overline{K}_{t}} and L^t:|^t|{0,1}K^t\widehat{L}_{t}:\mathbb{R}^{|\widehat{\mathcal{I}}_{t}|}\mapsto\{0,1\}^{\widehat{K}_{t}}, for each t𝒯t\in\mathcal{T}. The first lifting operator L¯t(𝜻t)\bar{L}_{t}(\bm{\zeta}_{t}) maps 𝜻t\bm{\zeta}_{t} onto a K¯t\overline{K}_{t}-dimensional space with K¯t=i^tri\overline{K}_{t}=\sum_{i\in\widehat{\mathcal{I}}_{t}}r_{i} such that each lifted parameter is a piecewise linear function of the corresponding ζti\zeta_{ti} and is defined as follows:

L¯tij(𝜻t):={ζtiifri=1inf{ζti,pij}ifri>1,j=1sup{inf{ζti,pij}pij1,0}ifri>1,j=2,,ri.\bar{L}_{ti}^{j}(\bm{\zeta}_{t}):=\left\{\begin{array}[]{lcl}{\zeta_{ti}}&\text{if}&r_{i}=1\\ {\inf\left\{\zeta_{ti},p_{i}^{j}\right\}}&\text{if}&r_{i}>1,\,j=1\\ {\sup\left\{\inf\left\{\zeta_{ti},p_{i}^{j}\right\}-p_{i}^{j-1},0\right\}}&\text{if}&r_{i}>1,\,j=2,\ldots,r_{i}.\end{array}\right.

By construction, we have ζti=j=1riL¯tij(𝜻t)\zeta_{ti}=\sum_{j=1}^{r_{i}}\bar{L}_{ti}^{j}(\bm{\zeta}_{t}). The second lifting operator L^t(𝜻t)\widehat{L}_{t}(\bm{\zeta}_{t}) maps 𝜻t\bm{\zeta}_{t} onto a K^t\widehat{K}_{t}-dimensional space with K^t=i^tgi\widehat{K}_{t}=\sum_{i\in\widehat{\mathcal{I}}_{t}}g_{i} and gi=max{1,ri1}g_{i}=\max\{1,\,r_{i}-1\}. It is defined such that

L^tij(𝜻t):={1ifri=1𝟙(ζtipij)ifri>1,j=1,,gi,\widehat{L}_{ti}^{j}(\bm{\zeta}_{t}):=\left\{\begin{array}[]{lcl}1&\text{if}&r_{i}=1\\ {\mathbbm{1}(\zeta_{ti}\geq p_{i}^{j})}&\text{if}&r_{i}>1,\,j=1,\ldots,g_{i},\end{array}\right.

which is a piecewise constant function of ζti\zeta_{ti}.

Using the lifting operators, we can define the following lifted endogenous uncertainty set:

Θ~t(𝒙[t1],𝒚[t1],𝒛[t1])={(𝝃[t],𝜻[t],𝜻¯[t],𝜻^[t])K~[t]:\displaystyle\widetilde{\Theta}^{\prime}_{t}(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-1]})=\left\{(\bm{\xi}_{[t]},\bm{\zeta}_{[t]},\bm{\bar{\zeta}}_{[t]},\bm{\hat{\zeta}}_{[t]})\in\mathbb{R}^{\widetilde{K}_{[t]}}:\vphantom{\begin{array}[]{l}\sum\limits_{t^{\prime}=1}^{t}\\[8.0pt] \bm{\bar{\zeta}}_{t^{\prime}}\end{array}}\right.
t=1t[𝑾tt𝝃t+𝑾¯tt𝜻t]t=1t1[𝑼ttc𝒙tc(𝜻[t])+𝑼ttb𝒙tb(𝜻[t])+𝑼¯tt𝒚t(𝜻[t])+𝑼^tt𝒛t(𝜻[t])]𝜻¯t=L¯t(𝜻t),𝜻^t=L^t(𝜻t)t=1,,t},\displaystyle\quad\left.\begin{array}[]{l}\sum\limits_{t^{\prime}=1}^{t}\left[\bm{W}_{tt^{\prime}}\bm{\xi}_{t^{\prime}}+\bm{\overline{W}}_{tt^{\prime}}\bm{\zeta}_{t^{\prime}}\right]\leq\sum\limits_{t^{\prime}=1}^{t-1}\left[\bm{U}^{\mathrm{c}}_{tt^{\prime}}\bm{x}^{\mathrm{c}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{U}^{\mathrm{b}}_{tt^{\prime}}\bm{x}^{\mathrm{b}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\overline{U}}_{tt^{\prime}}\bm{y}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\widehat{U}}_{tt^{\prime}}\bm{z}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})\right]\\[8.0pt] \bm{\bar{\zeta}}_{t^{\prime}}=\bar{L}_{t^{\prime}}\left(\bm{\zeta}_{t^{\prime}}\right),\;\bm{\hat{\zeta}}_{t^{\prime}}=\widehat{L}_{t^{\prime}}\left(\bm{\zeta}_{t^{\prime}}\right)\quad\forall\,t^{\prime}=1,\dots,t\end{array}\right\},

where K~[t]=t=1tKt+|^t|+K¯t+K^t\widetilde{K}_{[t]}=\sum_{t^{\prime}=1}^{t}K_{t^{\prime}}+|\widehat{\mathcal{I}}_{t^{\prime}}|+\overline{K}_{t^{\prime}}+\widehat{K}_{t^{\prime}}, and for all i^ti\in\widehat{\mathcal{I}}_{t}, 𝜻¯ti=(ζ¯ti1,,ζ¯tiri)\bm{\bar{\zeta}}_{ti}=(\bar{\zeta}_{ti}^{1},\ldots,\bar{\zeta}_{ti}^{r_{i}}), L¯ti=(L¯ti1,,L¯tiri)\bar{L}_{ti}=(\bar{L}_{ti}^{1},\ldots,\bar{L}_{ti}^{r_{i}}), 𝜻^ti=(ζ^ti1,,ζ^tigi)\bm{\hat{\zeta}}_{ti}=(\hat{\zeta}_{ti}^{1},\ldots,\hat{\zeta}_{ti}^{g_{i}}), and L^ti=(L^ti1,,L^tigi)\widehat{L}_{ti}=(\widehat{L}_{ti}^{1},\ldots,\widehat{L}_{ti}^{g_{i}}). The main purpose of defining such a lifted uncertainty set is to enable the construction of linear decision rules in the space of the lifted uncertain parameters. Notice that Θ~t\widetilde{\Theta}^{\prime}_{t} is an open set due to the discontinuity in L^\widehat{L}. Now consider problem (11) with Ξt\Xi_{t} replaced by Θ~t\widetilde{\Theta}^{\prime}_{t} and the recourse variables restricted to be linear functions of 𝜻¯\bm{\bar{\zeta}} and 𝜻^\bm{\hat{\zeta}}, and a second problem that is the same except that it considers the convex hull of the closure of Θ~t\widetilde{\Theta}^{\prime}_{t}, i.e. conv(cl(Θ~t))\mathrm{conv}(\mathrm{cl}(\widetilde{\Theta}^{\prime}_{t})), instead of Θ~t\widetilde{\Theta}^{\prime}_{t}. Following the same arguments as in Bertsimas and Georghiou 11, one can show that these two problems are equivalent in a sense that they have the same optimal value and that there is a one-to-one mapping between their feasible and optimal solutions.

It is difficult to obtain a closed-form description of conv(cl(Θ~t))\mathrm{conv}(\mathrm{cl}(\widetilde{\Theta}^{\prime}_{t})) as it requires finding all vertices of cl(Θ~t)\mathrm{cl}(\widetilde{\Theta}^{\prime}_{t}), which are decision-dependent. Moreover, the number of vertices grows exponentially with the dimensionality of cl(Θ~t)\mathrm{cl}(\widetilde{\Theta}^{\prime}_{t}) such that including them would render most problems of practical relevance computationally intractable. Therefore, instead of conv(cl(Θ~t))\mathrm{conv}(\mathrm{cl}(\widetilde{\Theta}^{\prime}_{t})), we consider an outer approximation, which we construct as follows. For each t𝒯t\in\mathcal{T}, i^ti\in\widehat{\mathcal{I}}_{t}, and j=1,,rij=1,\dots,r_{i}, we define a set of vertices 𝒱~tij\widetilde{\mathcal{V}}_{ti}^{j} in the lifted space:

𝒱~tij:={𝒗~=(v,𝒗¯,𝒗^):v{pij1,pij}𝒗¯=L¯ti(𝒆tiv)𝒗^=limζtiv,ζti[pij1,pij]L^ti(𝒆tiζti)},\widetilde{\mathcal{V}}_{ti}^{j}:=\left\{\bm{\tilde{v}}=\left(v,\,\bm{\bar{v}},\,\bm{\hat{v}}\right):\begin{array}[]{l}v\in\{p_{i}^{j-1},p_{i}^{j}\}\\ \bm{\bar{v}}=\bar{L}_{ti}(\bm{e}_{ti}v)\\ \bm{\hat{v}}=\lim\limits_{\zeta_{ti}\rightarrow v,\,\zeta_{ti}\in[p_{i}^{j-1},p_{i}^{j}]}\widehat{L}_{ti}(\bm{e}_{ti}\zeta_{ti})\end{array}\right\},

which allows the following convex hull formulation of the closure of the marginal support of 𝜻~ti\bm{\tilde{\zeta}}_{ti}:

Zti={𝜻~ti=(ζti,𝜻¯ti,𝜻^ti):𝒗~𝒱~tiλ(𝒗~)=1ζti=𝒗~𝒱~tiλ(𝒗~)v𝜻¯ti=𝒗~𝒱~tiλ(𝒗~)𝒗¯𝜻^ti=𝒗~𝒱~tiλ(𝒗~)𝒗^λ(𝒗~)+𝒗~𝒱~ti},Z_{ti}=\left\{\bm{\tilde{\zeta}}_{ti}=(\zeta_{ti},\bm{\bar{\zeta}}_{ti},\bm{\hat{\zeta}}_{ti}):\begin{array}[]{l}\sum\limits_{\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}}\lambda(\bm{\tilde{v}})=1\\ \zeta_{ti}=\sum\limits_{\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}}\lambda(\bm{\tilde{v}})v\\ \bm{\bar{\zeta}}_{ti}=\sum\limits_{\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}}\lambda(\bm{\tilde{v}})\bm{\bar{v}}\\ \bm{\hat{\zeta}}_{ti}=\sum\limits_{\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}}\lambda(\bm{\tilde{v}})\bm{\hat{v}}\\ \lambda(\bm{\tilde{v}})\in\mathbb{R}_{+}\quad\forall\,\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}\end{array}\right\},

where 𝒱~ti=j=1ri𝒱~tij\widetilde{\mathcal{V}}_{ti}=\bigcup_{j=1}^{r_{i}}\widetilde{\mathcal{V}}_{ti}^{j} and λ(𝒗~)\lambda(\bm{\tilde{v}}) denotes the coefficient associated with a particular vertex 𝒗~𝒱~ti\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}. We can now obtain the following closed-form expression of an outer approximation of conv(cl(Ξ~t))\mathrm{conv}(\mathrm{cl}(\widetilde{\Xi}^{\prime}_{t})):

{{𝝃[t]}×t=1ti^tZti}{(𝝃[t],𝜻[t],𝜻¯[t],𝜻^[t])K~[t]:t=1t[𝑾tt𝝃t+𝑾¯tt𝜻t]\displaystyle\left\{\{\bm{\xi}_{[t]}\}\times\prod_{t^{\prime}=1}^{t}\prod_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}Z_{t^{\prime}i}\right\}\bigcap\left\{(\bm{\xi}_{[t]},\bm{\zeta}_{[t]},\bm{\bar{\zeta}}_{[t]},\bm{\hat{\zeta}}_{[t]})\in\mathbb{R}^{\widetilde{K}_{[t]}}:\sum\limits_{t^{\prime}=1}^{t}\left[\bm{W}_{tt^{\prime}}\bm{\xi}_{t^{\prime}}+\bm{\overline{W}}_{tt^{\prime}}\bm{\zeta}_{t^{\prime}}\right]\right. (17)
t=1t1[𝑼ttc𝒙tc(𝜻[t])+𝑼ttb𝒙tb(𝜻[t])+𝑼¯tt𝒚t(𝜻[t])+𝑼^tt𝒛t(𝜻[t])]}.\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\leq\left.\sum\limits_{t^{\prime}=1}^{t-1}\left[\bm{U}^{\mathrm{c}}_{tt^{\prime}}\bm{x}^{\mathrm{c}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{U}^{\mathrm{b}}_{tt^{\prime}}\bm{x}^{\mathrm{b}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\overline{U}}_{tt^{\prime}}\bm{y}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\widehat{U}}_{tt^{\prime}}\bm{z}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})\right]\right\}.

6.3 Decision Rule Approximation

Following a similar approach as in Feng et al. (2020), we apply decision rules that are linear functions of the lifted uncertain parameters and parameterized as follows:

𝒙tc=t=1ti^t(𝑿¯ttic𝜻¯ti+𝑿^ttic𝜻^ti)\displaystyle\bm{x}^{\mathrm{c}}_{t}=\sum_{t^{\prime}=1}^{t}\sum_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}\left(\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i}\bm{\bar{\zeta}}_{t^{\prime}i}+\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i}\bm{\hat{\zeta}}_{t^{\prime}i}\right) (18a)
𝒙tb=t=1ti^t𝑿^ttib𝜻^ti,𝒚t=t=1ti^t𝒀^tti𝜻^ti,𝒛t=t=1ti^t𝒁^tti𝜻^ti,\displaystyle\bm{x}^{\mathrm{b}}_{t}=\sum_{t^{\prime}=1}^{t}\sum_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i}\bm{\hat{\zeta}}_{t^{\prime}i},\quad\bm{y}_{t}=\sum_{t^{\prime}=1}^{t}\sum_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}\bm{\widehat{Y}}_{tt^{\prime}i}\bm{\hat{\zeta}}_{t^{\prime}i},\quad\bm{z}_{t}=\sum_{t^{\prime}=1}^{t}\sum_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}\bm{\widehat{Z}}_{tt^{\prime}i}\bm{\hat{\zeta}}_{t^{\prime}i}, (18b)

where 𝑿¯tticN¯t×ri\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i}\in\mathbb{R}^{\bar{N}_{t}\times r_{i}}, 𝑿^tticN¯t×gi\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i}\in\mathbb{R}^{\bar{N}_{t}\times g_{i}}, 𝑿^ttib{1, 0, 1}N^t×gi\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{\widehat{N}_{t}\times g_{i}}, 𝒀^tti{1, 0, 1}|¯t+1|×gi\bm{\widehat{Y}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{|\bar{\mathcal{I}}_{t+1}|\times g_{i}}, and 𝒁^tti{1, 0, 1}|^t+1|×gi\bm{\widehat{Z}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{|\widehat{\mathcal{I}}_{t+1}|\times g_{i}}. The decision rules for the continuous variables in (18a) can form continuous or discontinuous piecewise linear functions of 𝜻[t]\bm{\zeta}_{[t]}. With the given domains for 𝑿^ttib\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i}, 𝒀^tti\bm{\widehat{Y}}_{tt^{\prime}i}, and 𝒁^tti\bm{\widehat{Z}}_{tt^{\prime}i}, a variable following a decision rule of the form (18b) is guaranteed to be binary, i.e. can only take the value 0 or 1, if we also constrain it to be in [0,1][0,1] 11. This property implies that the integrality constraints on the binary variables can be relaxed when these decision rules are applied.

By applying the proposed decision rules in conjunction with the uncertainty set in (17), decision-dependent nonanticipativity is captured, and we arrive at the following approximation of the multistage robust optimization problem under endogenous uncertainty:

minimize𝑿¯c,𝑿^c,𝑿^b,𝒀^,𝒁^\displaystyle\operatorname*{minimize}_{\begin{subarray}{c}\bm{\overline{X}}^{\mathrm{c}},\,\bm{\widehat{X}}^{\mathrm{c}},\\ \bm{\widehat{X}}^{\mathrm{b}},\,\bm{\widehat{Y}},\,\bm{\widehat{Z}}\end{subarray}} 𝒆1(𝑿¯111c+𝑿^111c)\displaystyle\bm{e}_{1}^{\top}\left(\bm{\overline{X}}^{\mathrm{c}}_{111}+\bm{\widehat{X}}^{\mathrm{c}}_{111}\right) (19)
subjectto\displaystyle\mathrm{subject\;to} (𝑿¯ttic,𝑿^ttic,𝑿^ttib,𝒀^tti,𝒁^tti)Ωttit𝒯,t=1,,t,i^t\displaystyle\left(\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i},\bm{\widehat{Y}}_{tt^{\prime}i},\bm{\widehat{Z}}_{tt^{\prime}i}\right)\in\Omega_{tt^{\prime}i}\quad\forall\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}}
t=1tt′′=1ti^t′′[𝑨ttc(𝑿¯tt′′ic𝜻¯t′′i+𝑿^tt′′ic𝜻^t′′i)+𝑨ttb𝑿^tt′′ib𝜻^t′′i+𝑫tt𝒀^tt′′i𝜻^t′′i+𝑯tt𝒁^tt′′i𝜻^t′′i]𝑩t𝝃[t]𝟎t′′=1ti^t′′𝑿^tt′′ib𝜻^t′′i𝒆t=1,,t𝟎t′′=1ti^t′′𝒀^tt′′i𝜻^t′′i𝒆t=1,,t𝟎t′′=1ti^t′′𝒁^tt′′i𝜻^t′′i𝒆t=1,,t}\displaystyle\left.\begin{array}[]{l}\sum\limits_{t^{\prime}=1}^{t}\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\left[\bm{A}^{\mathrm{c}}_{tt^{\prime}}\left(\bm{\overline{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\bar{\zeta}}_{t^{\prime\prime}i}+\bm{\widehat{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right)+\bm{A}^{\mathrm{b}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right.\\ \quad\quad\quad\quad\quad\quad\quad+\left.\bm{D}_{tt^{\prime}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}+\bm{H}_{tt^{\prime}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right]\leq\bm{B}_{t}\bm{\xi}_{[t]}\\[8.0pt] \bm{0}\leq\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\leq\bm{e}\quad\forall\,t^{\prime}=1,\dots,t\\[4.0pt] \bm{0}\leq\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\leq\bm{e}\quad\forall\,t^{\prime}=1,\dots,t\\[4.0pt] \bm{0}\leq\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\leq\bm{e}\quad\forall\,t^{\prime}=1,\dots,t\end{array}\right\}
t𝒯,(𝝃[t],𝜻[t],𝜻¯[t],𝜻^[t])Θ~t(𝑿¯[t1]c,𝑿^[t1]c,𝑿^[t1]b,𝒀^[t1],𝒁^[t1]),\displaystyle\quad\quad\quad\quad\quad\forall\,t\in\mathcal{T},\,(\bm{\xi}_{[t]},\bm{\zeta}_{[t]},\bm{\bar{\zeta}}_{[t]},\bm{\hat{\zeta}}_{[t]})\in\widetilde{\Theta}_{t}\left(\bm{\overline{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{b}}_{[t-1]},\bm{\widehat{Y}}_{[t-1]},\bm{\widehat{Z}}_{[t-1]}\right),

where

Ωtti={(𝑿¯ttic,𝑿^ttic,𝑿^ttib,𝒀^tti,𝒁^tti):𝑿¯tticN¯t×ri𝑿^tticN¯t×gi𝑿^ttib{1, 0, 1}N^t×gi𝒀^tti{1, 0, 1}|¯t+1|×gi𝒁^tti{1, 0, 1}|^t+1|×gi},\Omega_{tt^{\prime}i}=\left\{\left(\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i},\bm{\widehat{Y}}_{tt^{\prime}i},\bm{\widehat{Z}}_{tt^{\prime}i}\right):\begin{array}[]{l}\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i}\in\mathbb{R}^{\bar{N}_{t}\times r_{i}}\\[4.0pt] \bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i}\in\mathbb{R}^{\bar{N}_{t}\times g_{i}}\\[4.0pt] \bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{\widehat{N}_{t}\times g_{i}}\\[4.0pt] \bm{\widehat{Y}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{|\bar{\mathcal{I}}_{t+1}|\times g_{i}}\\[4.0pt] \bm{\widehat{Z}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{|\widehat{\mathcal{I}}_{t+1}|\times g_{i}}\end{array}\right\},

and 𝑨ttc\bm{A}^{\mathrm{c}}_{tt^{\prime}} and 𝑨ttb\bm{A}^{\mathrm{b}}_{tt^{\prime}} are appropriately defined constraint matrices associated with the corresponding continuous and binary 𝒙\bm{x}-variables, respectively. The uncertainty set is

Θ~t(𝑿¯[t1]c,𝑿^[t1]c,𝑿^[t1]b,𝒀^[t1],𝒁^[t1])\displaystyle\widetilde{\Theta}_{t}\left(\bm{\overline{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{b}}_{[t-1]},\bm{\widehat{Y}}_{[t-1]},\bm{\widehat{Z}}_{[t-1]}\right)
={{𝝃[t]}×t=1ti^tZti}Θ^t(𝑿¯[t1]c,𝑿^[t1]c,𝑿^[t1]b,𝒀^[t1],𝒁^[t1])\displaystyle\quad\quad\quad=\left\{\{\bm{\xi}_{[t]}\}\times\prod_{t^{\prime}=1}^{t}\prod_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}Z_{t^{\prime}i}\right\}\bigcap\,\widehat{\Theta}_{t}\left(\bm{\overline{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{b}}_{[t-1]},\bm{\widehat{Y}}_{[t-1]},\bm{\widehat{Z}}_{[t-1]}\right)

with

Θ^t\displaystyle\widehat{\Theta}_{t} ={(𝝃[t],𝜻[t],𝜻¯[t],𝜻^[t])K~[t]:t=1t[𝑾tt𝝃t+𝑾¯tt𝜻t]\displaystyle=\left\{(\bm{\xi}_{[t]},\bm{\zeta}_{[t]},\bm{\bar{\zeta}}_{[t]},\bm{\hat{\zeta}}_{[t]})\in\mathbb{R}^{\widetilde{K}_{[t]}}:\sum_{t^{\prime}=1}^{t}\left[\bm{W}_{tt^{\prime}}\bm{\xi}_{t^{\prime}}+\bm{\overline{W}}_{tt^{\prime}}\bm{\zeta}_{t^{\prime}}\right]\right.
t=1t1t′′=1ti^t′′[𝑼ttc(𝑿¯tt′′ic𝜻¯t′′i+𝑿^tt′′ic𝜻^t′′i)+𝑼ttb𝑿^tt′′ib𝜻^t′′i+𝑼¯tt𝒀^tt′′i𝜻^t′′i+𝑼^tt𝒁^tt′′i𝜻^t′′i]}.\displaystyle\quad\quad\leq\left.\sum\limits_{t^{\prime}=1}^{t-1}\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\left[\bm{U}^{\mathrm{c}}_{tt^{\prime}}\left(\bm{\overline{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\bar{\zeta}}_{t^{\prime\prime}i}+\bm{\widehat{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right)+\bm{U}^{\mathrm{b}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}+\bm{\overline{U}}_{tt^{\prime}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}+\bm{\widehat{U}}_{tt^{\prime}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right]\right\}.

For ease of exposition, we define new constraint and right-hand-side matrices such that we can express problem (19) in the following compact form:

minimize𝑿¯c,𝑿^c,𝑿^b,𝒀^,𝒁^\displaystyle\operatorname*{minimize}_{\begin{subarray}{c}\bm{\overline{X}}^{\mathrm{c}},\,\bm{\widehat{X}}^{\mathrm{c}},\\ \bm{\widehat{X}}^{\mathrm{b}},\,\bm{\widehat{Y}},\,\bm{\widehat{Z}}\end{subarray}} 𝒆1(𝑿¯111c+𝑿^111c)\displaystyle\bm{e}_{1}^{\top}\left(\bm{\overline{X}}^{\mathrm{c}}_{111}+\bm{\widehat{X}}^{\mathrm{c}}_{111}\right) (20)
subjectto\displaystyle\mathrm{subject\;to} t=1tt′′=1ti^t′′[𝑨~ttc𝑿¯tt′′ic𝜻¯t′′i+(𝑨~ttc𝑿^tt′′ic+𝑨~ttb𝑿^tt′′ib+𝑫~tt𝒀^tt′′i+𝑯~tt𝒁^tt′′i)𝜻^t′′i]\displaystyle\sum\limits_{t^{\prime}=1}^{t}\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\left[\bm{\widetilde{A}}^{\mathrm{c}}_{tt^{\prime}}\bm{\overline{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\bar{\zeta}}_{t^{\prime\prime}i}+\left(\bm{\widetilde{A}}^{\mathrm{c}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{A}}^{\mathrm{b}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{D}}_{tt^{\prime}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{H}}_{tt^{\prime}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\right)\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right]
t=1t𝑩~tt𝝃tt𝒯,(𝝃[t],𝜻[t],𝜻¯[t],𝜻^[t])Θ~t(𝑿¯[t1]c,𝑿^[t1]c,𝑿^[t1]b,𝒀^[t1],𝒁^[t1])\displaystyle\quad\quad\quad\leq\sum_{t^{\prime}=1}^{t}\bm{\widetilde{B}}_{tt^{\prime}}\bm{\xi}_{t^{\prime}}\quad\forall\,t\in\mathcal{T},\,(\bm{\xi}_{[t]},\bm{\zeta}_{[t]},\bm{\bar{\zeta}}_{[t]},\bm{\hat{\zeta}}_{[t]})\in\widetilde{\Theta}_{t}\left(\bm{\overline{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{b}}_{[t-1]},\bm{\widehat{Y}}_{[t-1]},\bm{\widehat{Z}}_{[t-1]}\right)
(𝑿¯ttic,𝑿^ttic,𝑿^ttib,𝒀^tti,𝒁^tti)Ωttit𝒯,t=1,,t,i^t,\displaystyle\left(\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i},\bm{\widehat{Y}}_{tt^{\prime}i},\bm{\widehat{Z}}_{tt^{\prime}i}\right)\in\Omega_{tt^{\prime}i}\quad\forall\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}},

where 𝑨~ttcPt×N¯t\bm{\widetilde{A}}^{\mathrm{c}}_{tt^{\prime}}\in\mathbb{R}^{P_{t}\times\bar{N}_{t^{\prime}}}, 𝑨~ttbPt×N^t\bm{\widetilde{A}}^{\mathrm{b}}_{tt^{\prime}}\in\mathbb{R}^{P_{t}\times\widehat{N}_{t^{\prime}}}, 𝑫~ttPt×|¯t+1|\bm{\widetilde{D}}_{tt^{\prime}}\in\mathbb{R}^{P_{t}\times|\bar{\mathcal{I}}_{t^{\prime}+1}|}, 𝑯~ttPt×|^t+1|\bm{\widetilde{H}}_{tt^{\prime}}\in\mathbb{R}^{P_{t}\times|\widehat{\mathcal{I}}_{t^{\prime}+1}|}, and 𝑩~ttPt×Kt\bm{\widetilde{B}}_{tt^{\prime}}\in\mathbb{R}^{P_{t}\times K_{t^{\prime}}}.

6.4 Reformulation

Applying standard robust optimization techniques based on constraint-wise worst-case reformulation and linear programming (LP) duality 56, the semi-infinite program (20) can be reformulated into the following problem:

minimize𝑿¯c,𝑿^c,𝑿^b,𝒀^,𝒁^,𝚽,𝚿\displaystyle\operatorname*{minimize}_{\begin{subarray}{c}\bm{\overline{X}}^{\mathrm{c}},\,\bm{\widehat{X}}^{\mathrm{c}},\,\bm{\widehat{X}}^{\mathrm{b}},\\ \bm{\widehat{Y}},\,\bm{\widehat{Z}},\,\bm{\Phi},\,\bm{\Psi}\end{subarray}} 𝒆1(𝑿¯111c+𝑿^111c)\displaystyle\bm{e}_{1}^{\top}\left(\bm{\overline{X}}^{\mathrm{c}}_{111}+\bm{\widehat{X}}^{\mathrm{c}}_{111}\right) (21)
subjectto\displaystyle\mathrm{subject\;to} t=1ti^t𝝍tti𝟎t𝒯\displaystyle\sum_{t^{\prime}=1}^{t}\sum_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}\bm{\psi}_{tt^{\prime}i}\leq\bm{0}\quad\forall\,t\in\mathcal{T}
𝚽t𝒘tti=𝒃~ttit𝒯,t=1,,t,i~t\displaystyle\bm{\Phi}_{t}\bm{w}_{tt^{\prime}i}=-\bm{\tilde{b}}_{tt^{\prime}i}\quad\forall\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t,\,i\in\widetilde{\mathcal{I}}_{t^{\prime}}
𝝍tt′′i+𝚽t[𝒘¯tt′′ivt′′it=t′′t1𝑼ttc𝑿¯tt′′ic𝒗¯t′′i\displaystyle\bm{\psi}_{tt^{\prime\prime}i}+\bm{\Phi}_{t}\left[\bm{\bar{w}}_{tt^{\prime\prime}i}v_{t^{\prime\prime}i}-\sum\limits_{t^{\prime}=t^{\prime\prime}}^{t-1}\bm{U}^{\mathrm{c}}_{tt^{\prime}}\bm{\overline{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\bar{v}}_{t^{\prime\prime}i}\right.
t=t′′t1(𝑼ttc𝑿^tt′′ic+𝑼ttb𝑿^tt′′ib+𝑼¯tt𝒀^tt′′i+𝑼^tt𝒁^tt′′i)𝒗^t′′i]\displaystyle\quad\quad\quad\quad\quad\quad-\left.\sum\limits_{t^{\prime}=t^{\prime\prime}}^{t-1}\left(\bm{U}^{\mathrm{c}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}+\bm{U}^{\mathrm{b}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}+\bm{\overline{U}}_{tt^{\prime}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widehat{U}}_{tt^{\prime}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\right)\bm{\hat{v}}_{t^{\prime\prime}i}\right]
t=t′′t[𝑨~ttc𝑿¯tt′′ic𝒗¯t′′i+(𝑨~ttc𝑿^tt′′ic+𝑨~ttb𝑿^tt′′ib+𝑫~tt𝒀^tt′′i+𝑯~tt𝒁^tt′′i)𝒗^t′′i]\displaystyle\quad\quad\geq\sum\limits_{t^{\prime}=t^{\prime\prime}}^{t}\left[\bm{\widetilde{A}}^{\mathrm{c}}_{tt^{\prime}}\bm{\overline{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\bar{v}}_{t^{\prime\prime}i}+\left(\bm{\widetilde{A}}^{\mathrm{c}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{A}}^{\mathrm{b}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{D}}_{tt^{\prime}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{H}}_{tt^{\prime}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\right)\bm{\hat{v}}_{t^{\prime\prime}i}\right]
t𝒯,t′′=1,,t,i^t′′,𝒗~t′′i𝒱~t′′i\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\forall\,t\in\mathcal{T},\,t^{\prime\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime\prime}},\,\bm{\tilde{v}}_{t^{\prime\prime}i}\in\widetilde{\mathcal{V}}_{t^{\prime\prime}i}
𝝍ttiPtt𝒯,t=1,,t,i^t\displaystyle\bm{\psi}_{tt^{\prime}i}\in\mathbb{R}^{P_{t}}\quad\forall\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}}
𝚽t+Pt×Qtt𝒯\displaystyle\bm{\Phi}_{t}\in\mathbb{R}_{+}^{P_{t}\times Q_{t}}\quad\forall\,t\in\mathcal{T}
(𝑿¯ttic,𝑿^ttic,𝑿^ttib,𝒀^tti,𝒁^tti)Ωttit𝒯,t=1,,t,i^t\displaystyle\left(\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i},\bm{\widehat{Y}}_{tt^{\prime}i},\bm{\widehat{Z}}_{tt^{\prime}i}\right)\in\Omega_{tt^{\prime}i}\quad\forall\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}}

where 𝒘tti\bm{w}_{tt^{\prime}i}, 𝒘¯tti\bm{\bar{w}}_{tt^{\prime}i} 𝒃~tti\bm{\tilde{b}}_{tt^{\prime}i}, and 𝝍ti\bm{\psi}_{ti} denote the columns of 𝑾tt\bm{W}_{tt^{\prime}}, 𝑾¯tt\bm{\overline{W}}_{tt^{\prime}}, 𝑩~tt\bm{\widetilde{B}}_{tt^{\prime}}, and 𝚿t\bm{\Psi}_{t}, respectively, corresponding to uncertain parameter ii. The detailed derivation of the reformulation can be found in the supplementary material.

Problem (21) is generally a mixed-integer quadratically constrained program (MIQCP) as it involves bilinear terms. However, if the endogenous uncertainty set only depends on binary variables, the problem can be reformulated into a mixed-integer linear program (MILP) by expressing each integer variable as the difference between two binary variables and linearizing the bilinear terms, which will all be products of a continuous and a binary variable (for more details, see Feng et al. (2020)).

7 Computational Case Studies

In this section, we present the results from four computational case studies that demonstrate the versatility of the proposed modeling framework. All model instances were implemented in Julia v1.3.1 using the modeling environment JuMP v0.21.2 57 and solved using Gurobi v8.1.1 on a Intel Core i7-8700 CPU at 3.20 GHz machine with 8 GB RAM.

7.1 Pilot Plant and Plant Redesign

We first consider an illustrative example that is based on Example 1. In this problem, we have a design of an industrial-scale plant whose capacity qq is uncertain and only becomes known after the plant is built. However, as described in Example 1, a pilot plant can be built such that its capacity pp helps provide a more accurate prediction of qq. Once pp is observed, the marginal uncertainty set for qq is updated, based on which we can decide to build one or two plants using the current design or redesign the plant and build a plant using the new design. The solution has to ensure that a given demand dd can be met, and the objective is to optimize the worst case. The associated robust optimization problem can be formulated as follows:

minimizex,y,z1,z2\displaystyle\operatorname*{minimize}_{x,y,z_{1},z_{2}} δx+max(p,q,γ)Ξ(x)γy(xp)+ρ[z1(xp)+z2(xp)]\displaystyle\delta x+\max_{(p,q,\gamma)\in\Xi(x)}\gamma y(xp)+\rho\left[z_{1}(xp)+z_{2}(xp)\right] (22)
subjectto\displaystyle\mathrm{subject\;to} x{0,1}\displaystyle x\in\{0,1\}
y(xp)xdq[z1(xp)+z2(xp)]+dy(xp)y(xp),z1(xp),z2(xp){0,1}}(p,q,γ)Ξ(x),\displaystyle\left.\begin{array}[]{l}y(xp)\leq x\\[4.0pt] d\leq q\left[z_{1}(xp)+z_{2}(xp)\right]+dy(xp)\\[4.0pt] y(xp),\,z_{1}(xp),\,z_{2}(xp)\in\{0,1\}\end{array}\right\}\quad\forall\,(p,q,\gamma)\in\Xi(x),

where the binary variable xx equals 1 if the pilot plant is built and 0 otherwise, yy equals 1 if a plant with a new design and capacity dd is built, and z1z_{1} and z2z_{2} indicate how many (maximum two) plants with the current design are built. Nonnegative cost coefficients in the cost function are denoted by δ\delta, γ\gamma, and ρ\rho, where γ\gamma is an uncertain parameter as it depends on the pilot plant capacity. As indicated by the constraint y(xp)xy(xp)\leq x, a plant redesign is only possible if a pilot plant is built and hence more process knowledge is available. Finally, the total plant capacity has to be greater than or equal to demand dd. The uncertainty set depends on xx and can be expressed as follows:

Ξ(x)={(p,q,γ)+3:pminxppmaxxqmin(1x)+(αΔ)x+βpqqmax(1x)+(α+Δ)x+βpγ=(ρ+γ~pmax)xγ~p},\Xi(x)=\left\{(p,q,\gamma)\in\mathbb{R}_{+}^{3}:\begin{array}[]{l}p^{\min}x\leq p\leq p^{\max}x\\[4.0pt] q^{\min}(1-x)+(\alpha-\Delta)x+\beta p\leq q\leq q^{\max}(1-x)+(\alpha+\Delta)x+\beta p\\[4.0pt] \gamma=(\rho+\tilde{\gamma}p^{\max})x-\tilde{\gamma}p\end{array}\right\},

where pp and γ\gamma can only take nonzero values if x=1x=1. Similarly, (αΔ)+βpq(α+Δ)+βp(\alpha-\Delta)+\beta p\leq q\leq(\alpha+\Delta)+\beta p if x=1x=1, and qminqqmaxq^{\min}\leq q\leq q^{\max} if x=0x=0. The parameters are given such that qmin=(αΔ)+βpminq^{\min}=(\alpha-\Delta)+\beta p^{\min} and qmax=(α+Δ)+βpmaxq^{\max}=(\alpha+\Delta)+\beta p^{\max}. We further assume that qmin<d2qminq^{\min}<d\leq 2q^{\min} and that there exists a p^<pmax\hat{p}<p^{\max} such that d=(αΔ)+βp^d=(\alpha-\Delta)+\beta\hat{p}.

7.1.1 Analytical Solution

Problem (22) can be solved analytically, and the solution is illustrated in the decision diagram shown in Figure 4. If we choose to build a pilot plant (x=1x=1), we can take recourse actions (yy, z1z_{1}, and z2z_{2}) based on the realization of pp. If pp^p\geq\hat{p}, we know that qdq\geq d; hence, we only have to build one plant using the current design. If p<p^p<\hat{p}, building one plant with the current design may not be sufficient to meet demand dd. In this case, we choose the less expensive option between building one plant with a new design that achieves a capacity of dd and building two plants with the current design. The costs for these two choices are v=δ+ρ+γ~(pmaxp)v=\delta+\rho+\tilde{\gamma}(p^{\max}-p) and v=δ+2ρv=\delta+2\rho, respectively. Thus, the first option is less expensive than the second if γ~(pmaxp)<ρ\tilde{\gamma}(p^{\max}-p)<\rho. In contrast, if we do not build a pilot plant (x=0x=0), there will be no recourse; in that case, in order to guarantee feasibility for every possible realization of qq, we have to build two plants using the current design, which results in a cost of v=2ρv=2\rho.

Refer to caption
Figure 4: Decision diagram for illustrative pilot plant example.

The problem further simplifies since the objective is to minimize the worst-case cost and the worst case is already known a priori, namely p=pminp=p^{\min}, q=qminq=q^{\min}, and γ=ρ+γ~(pmaxpmin)\gamma=\rho+\tilde{\gamma}(p^{\max}-p^{\min}). Therefore, the optimal value of (22) is simply

v=min{2ρ,δ+ρ+γ~(pmaxpmin)},v^{*}=\min\left\{2\rho,\,\delta+\rho+\tilde{\gamma}(p^{\max}-p^{\min})\right\},

and the corresponding optimal solution is as follows: If ρ<δ+γ~(pmaxpmin)\rho<\delta+\tilde{\gamma}(p^{\max}-p^{\min}), x=y=0x^{*}=y^{*}=0 and z1=z2=1z_{1}^{*}=z_{2}^{*}=1; if ρ>δ+γ~(pmaxpmin)\rho>\delta+\tilde{\gamma}(p^{\max}-p^{\min}), x=y=1x^{*}=y^{*}=1 and z1=z2=0z_{1}^{*}=z_{2}^{*}=0. Both solutions are optimal if ρ=δ+γ~(pmaxpmin)\rho=\delta+\tilde{\gamma}(p^{\max}-p^{\min}).

For our computational case study, we choose ρ=100\rho=100, d=59d=59, pmin=5p^{\min}=5, pmax=10p^{\max}=10, qmin=35q^{\min}=35, qmax=75q^{\max}=75, α=10\alpha=10, β=6\beta=6, and Δ=5\Delta=5. The heat map in Figure 5a shows vv^{*} as a function of δ\delta and γ~\tilde{\gamma}. The red line is given by δ=ρ(pmaxpmin)γ~=1005γ~\delta=\rho-(p^{\max}-p^{\min})\tilde{\gamma}=100-5\tilde{\gamma}, which indicates the boundary at which vv^{*} switches from δ+ρ+γ~(pmaxpmin)\delta+\rho+\tilde{\gamma}(p^{\max}-p^{\min}) to 2ρ2\rho.

Refer to caption

(a) Worst-case cost

Refer to caption

(b) Approximate expected cost
Figure 5: The heat maps show the (a) worst-case cost and (b) scenario-based approximation of the expected cost as functions of δ\delta and γ~\tilde{\gamma}.

7.1.2 Non-Worst-Case Objective Function

As mentioned above, in this example, recourse is only possible if x=1x=1, which essentially means that this decision determines whether the problem is static or two-stage. However, note that recourse is not required in the optimal solution of (22), which is due to the fact that worst-case cost minimization is considered and that the worst case is known a priori. In the following, we consider an alternative, scenario-based objective function, which is often a better choice in practice. Here, we introduce a discrete scenario set 𝒮\mathcal{S}, which contains possible realizations of pp, each denoted by p¯s\bar{p}_{s} with probability ϕs\phi_{s}, where s𝒮ϕs=1\sum_{s\in\mathcal{S}}\phi_{s}=1. We then replace the objective function in problem (22) with

δx+s𝒮ϕs(γ¯sy(xp¯s)+ρ[z1(xp¯s)+z2(xp¯s)])\delta x+\sum_{s\in\mathcal{S}}\phi_{s}\left(\bar{\gamma}_{s}y(x\bar{p}_{s})+\rho\left[z_{1}(x\bar{p}_{s})+z_{2}(x\bar{p}_{s})\right]\right) (23)

with γ¯s=ρ+γ~(pmaxp¯s)\bar{\gamma}_{s}=\rho+\tilde{\gamma}(p^{\max}-\bar{p}_{s}). This objective function resembles a sample average approximation of the expected cost as commonly used in stochastic programming. We further reformulate the constraints to eliminate random recourse and remove γ\gamma from the uncertainty set, arriving at the following formulation:

minimizex,y,z1,z2,w1,w2\displaystyle\operatorname*{minimize}_{x,y,z_{1},z_{2},w_{1},w_{2}} δx+s𝒮ϕs(γ¯sy(xp¯s)+ρ[z1(xp¯s)+z2(xp¯s)])\displaystyle\delta x+\sum_{s\in\mathcal{S}}\phi_{s}\left(\bar{\gamma}_{s}y(x\bar{p}_{s})+\rho\left[z_{1}(x\bar{p}_{s})+z_{2}(x\bar{p}_{s})\right]\right) (24)
subjectto\displaystyle\mathrm{subject\;to} x{0,1}\displaystyle x\in\{0,1\}
y(xp)xw1(xp)qw1(xp)qmaxz1(xp)w2(xp)qw2(xp)qmaxz2(xp)dw1(xp)+w2(xp)+dy(xp)y(xp),z1(xp),z2(xp){0,1}w1(xp),w2(xp)+}(p,q)Ξ¯(x)\displaystyle\left.\begin{array}[]{l}y(xp)\leq x\\[4.0pt] w_{1}(xp)\leq q\\[4.0pt] w_{1}(xp)\leq q^{\max}z_{1}(xp)\\[4.0pt] w_{2}(xp)\leq q\\[4.0pt] w_{2}(xp)\leq q^{\max}z_{2}(xp)\\[4.0pt] d\leq w_{1}(xp)+w_{2}(xp)+dy(xp)\\[4.0pt] y(xp),\,z_{1}(xp),\,z_{2}(xp)\in\{0,1\}\\ w_{1}(xp),\,w_{2}(xp)\in\mathbb{R}_{+}\end{array}\right\}\quad\forall\,(p,q)\in\overline{\Xi}(x)

with

Ξ¯(x)={(p,q)+2:pminxppmaxxqmin(1x)+(αΔ)x+βpqqmax(1x)+(α+Δ)x+βp}.\small\overline{\Xi}(x)=\left\{(p,q)\in\mathbb{R}_{+}^{2}:\begin{array}[]{l}p^{\min}x\leq p\leq p^{\max}x\\[4.0pt] q^{\min}(1-x)+(\alpha-\Delta)x+\beta p\leq q\leq q^{\max}(1-x)+(\alpha+\Delta)x+\beta p\end{array}\right\}.

Note that we have introduced two new continuous recourse variables, w1w_{1} and w2w_{2}, which represent the production amounts of plants built with the current design.

The optimal solution to this modified problem is less trivial as now recourse decisions, as depicted in Figure 4, play a crucial role. Still, we can solve it analytically. If we choose to build a pilot plant, the optimal recourse decisions are just as described in Section 7.1.1. As a result, the optimal value is

v=min{2ρ,δ+s𝒮1ϕsρ+s𝒮2ϕs[ρ+γ~(pmaxp¯s)]+s𝒮32ϕsρ},v^{*}=\min\left\{2\rho,\,\delta+\sum_{s\in\mathcal{S}_{1}}\phi_{s}\rho+\sum_{s\in\mathcal{S}_{2}}\phi_{s}[\rho+\tilde{\gamma}(p^{\max}-\bar{p}_{s})]+\sum_{s\in\mathcal{S}_{3}}2\phi_{s}\rho\right\},

where 𝒮1:={s𝒮:p¯sp^}\mathcal{S}_{1}:=\{s\in\mathcal{S}:\bar{p}_{s}\geq\hat{p}\}, 𝒮2:={s𝒮:p¯s<p^,γ~(pmaxp¯s)<ρ}\mathcal{S}_{2}:=\{s\in\mathcal{S}:\bar{p}_{s}<\hat{p},\,\tilde{\gamma}(p^{\max}-\bar{p}_{s})<\rho\}, and 𝒮3:={s𝒮:p¯s<p^,γ~(pmaxp¯s)ρ}\mathcal{S}_{3}:=\{s\in\mathcal{S}:\bar{p}_{s}<\hat{p},\,\tilde{\gamma}(p^{\max}-\bar{p}_{s})\geq\rho\} such that 𝒮=𝒮1𝒮2𝒮3\mathcal{S}=\mathcal{S}_{1}\cup\mathcal{S}_{2}\cup\mathcal{S}_{3}.

We can also apply the proposed decision rule approach to solve this problem. To obtain the exact optimal solution, the marginal support of pp needs to be partitioned using appropriate breakpoints. Depending on the parameter values, we require at least one or two breakpoints. If γ~(pmaxpmin)ρ\tilde{\gamma}(p^{\max}-p^{\min})\leq\rho or γ~(pmaxp^)ρ\tilde{\gamma}(p^{\max}-\hat{p})\geq\rho, we only need one breakpoint at p^\hat{p}. If γ~(pmaxpmin)>ρ\tilde{\gamma}(p^{\max}-p^{\min})>\rho and γ~(pmaxp^)<ρ\tilde{\gamma}(p^{\max}-\hat{p})<\rho, we need two breakpoints, one at pmaxρ/γ~p^{\max}-\rho/\tilde{\gamma} and one at p^\hat{p}. Note that the same decision rules are used across all scenarios in 𝒮\mathcal{S}; as a result, compared to the worst-case formulation, no additional variables are added to the model such that a comparable computational complexity is achieved. We solve the problem using 100 scenarios sampled from a uniform distribution supported on the given uncertainty set, and obtain the heat map shown in Figure 5b.

7.2 Maintenance Planning with Inspections

This example involves type-1, type-2a, and type-2b endogenous uncertainty. Here, we consider integrated production and maintenance planning with equipment degradation. We use the remaining useful life (RUL) as a measure of equipment health, which decreases with continued operation. To avoid equipment failure, the RUL has to remain greater than zero and can be restored through maintenance. Investing in an equipment upgrade, which would reduce the degradation rate, is also considered. However, degradation processes are inherently stochastic, which renders the change in RUL an uncertain parameter. Furthermore, RUL is an abstract quantity that is often difficult to estimate from operational process data such that elaborate inspections are required to obtain reliable equipment health information. Therefore, in this example, we assume that the RUL can only be observed through inspections, and formulate the resulting multistage robust optimization problem as follows:

minimizex,𝒚,𝒑,𝒔,𝒓,𝒎,𝒇,𝒛\displaystyle\operatorname*{minimize}_{x,\bm{y},\bm{p},\bm{s},\bm{r},\bm{m},\bm{f},\bm{z}} αx+max(𝝃,𝝃¯,𝜻)ΘT(x,𝒚,𝒛)t𝒯(βyt+γpt+δzt+ρmtφft)\displaystyle\alpha x+\max_{(\bm{\xi},\bm{\bar{\xi}},\bm{\zeta})\in\Theta_{T}(x,\bm{y},\bm{z})}\sum_{t\in\mathcal{T}}\left(\beta y_{t}+\gamma p_{t}+\delta z_{t}+\rho m_{t}-\varphi f_{t}\right) (25)
subjectto\displaystyle\mathrm{subject\;to} x{0,1}\displaystyle x\in\{0,1\}
st=s0+t=1t(ptdt)smaxptCytrt=r0t=1tξt+t=1tftrmaxftrmaxmtyt+mt1mtzt1pt,st,rt,ft+,yt,zt,mt{0,1}}t𝒯,(𝝃[t],𝝃¯[t],𝜻[t])Θt(x,𝒚[t],𝒛[t]),\displaystyle\left.\begin{array}[]{l}s_{t}=s_{0}+\sum\limits_{t^{\prime}=1}^{t}(p_{t^{\prime}}-d_{t^{\prime}})\leq s^{\max}\\[4.0pt] p_{t}\leq Cy_{t}\\[4.0pt] r_{t}=r_{0}-\sum\limits_{t^{\prime}=1}^{t}\xi_{t^{\prime}}+\sum\limits_{t^{\prime}=1}^{t}f_{t^{\prime}}\leq r^{\max}\\[4.0pt] f_{t}\leq r^{\max}m_{t}\\[4.0pt] y_{t}+m_{t}\leq 1\\[4.0pt] m_{t}\leq z_{t-1}\\[4.0pt] p_{t},\,s_{t},\,r_{t},\,f_{t}\in\mathbb{R}_{+},\quad y_{t},\,z_{t},\,m_{t}\in\{0,1\}\end{array}\right\}\begin{array}[]{l}\forall\,t\in\mathcal{T},\\[4.0pt] (\bm{\xi}_{[t]},\bm{\bar{\xi}}_{[t]},\bm{\zeta}_{[t]})\in\Theta_{t}(x,\bm{y}_{[t]},\bm{z}_{[t]}),\end{array}

where xx equals 1 if an equipment upgrade is performed, yty_{t} equals 1 if the plant operates in time period tt, ptp_{t} is the amount produced in time period tt, sts_{t} is the inventory level at time tt (end of time period tt), rtr_{t} is the RUL at time tt, mtm_{t} equals 1 if maintenance is performed in time period tt, ftf_{t} is the amount of RUL restored in time period tt, and ztz_{t} equals 1 if an inspection is conducted at time tt. The recourse variables are functions of uncertain parameters; however, for the sake of brevity, we do not explicitly indicate this in (25). With given positive cost coefficients α\alpha, β\beta, γ\gamma, δ\delta, ρ\rho, and φ\varphi, the objective is to minimize the worst-case cost over the planning horizon given by the set of time periods 𝒯={1,,T}\mathcal{T}=\{1,\dots,T\}. Note that a negative cost is assigned to ftf_{t} in order to encourage maintenance toward the end of the equipment’s residual life and to retain a reasonable RUL value at the end of the planning horizon. Given an initial inventory s0s_{0} and demand dtd_{t} for every time period tt, the second constraint in (25) expresses material balance and sets an upper bound smaxs^{\max} on the inventory. The next constraint states that the production amount is bounded by the plant capacity CC. Assuming the initial RUL r0r_{0} is known, rtr_{t} is computed and bounded in the next constraint. The RUL reduction caused by operation in time period tt is denoted by ξt\xi_{t}, which is uncertain. If maintenance is performed, the RUL can be restored to its maximum rmaxr^{\max}, as stated in the next inequality. Finally, the last two constraints state that maintenance can only be performed if the plant is shut down and needs to be immediately preceded by an inspection. We further assume that z0=m1=f1=0z_{0}=m_{1}=f_{1}=0.

The decision-dependent uncertainty set, which includes auxiliary uncertain parameters, is

Θt(x,𝒚[t],𝒛[t])={(𝝃[t],𝝃¯[t],𝜻[t])+3t:ξ¯t=t′′=1tξt′′t=1,,tξminytξtξmaxytt=1,,tξtξmaxξΔxt=1,,tξ¯ttξmax(1zt)ζtξ¯tt=1,,tζtξ¯tmaxztt=1,,t},\Theta_{t}(x,\bm{y}_{[t]},\bm{z}_{[t]})=\left\{(\bm{\xi}_{[t]},\bm{\bar{\xi}}_{[t]},\bm{\zeta}_{[t]})\in\mathbb{R}^{3t}_{+}:\begin{array}[]{l}\bar{\xi}_{t^{\prime}}=\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\xi_{t^{\prime\prime}}\quad\forall\,t^{\prime}=1,\dots,t\\[8.0pt] \xi^{\min}y_{t^{\prime}}\leq\xi_{t^{\prime}}\leq\xi^{\max}y_{t^{\prime}}\quad\forall\,t^{\prime}=1,\dots,t\\[4.0pt] \xi_{t^{\prime}}\leq\xi^{\max}-\xi^{\Delta}x\quad\forall\,t^{\prime}=1,\dots,t\\[4.0pt] \bar{\xi}_{t^{\prime}}-t^{\prime}\xi^{\max}(1-z_{t^{\prime}})\leq\zeta_{t^{\prime}}\leq\bar{\xi}_{t^{\prime}}\quad\forall\,t^{\prime}=1,\dots,t\\[4.0pt] \zeta_{t^{\prime}}\leq\bar{\xi}^{\max}_{t^{\prime}}z_{t^{\prime}}\quad\forall\,t^{\prime}=1,\dots,t\end{array}\right\},

where the auxiliary uncertain parameter ξ¯t\bar{\xi}_{t} is the cumulative RUL reduction up to time tt. The reason for introducing ξ¯t\bar{\xi}_{t} is that an inspection at time tt reveals ξ¯t\bar{\xi}_{t} but not each individual ξt\xi_{t^{\prime}} that has contributed to the cumulative RUL reduction since the last inspection. The RUL reduction in time period tt, ξt\xi_{t}, is zero if the plant does not operate, i.e. yt=0y_{t}=0, and bounded by ξmin\xi^{\min} and ξmax\xi^{\max} otherwise. This upper bound is further reduced by ξΔ\xi^{\Delta} if an equipment upgrade is performed, i.e. x=1x=1. We have another auxiliary variable ζt\zeta_{t} for every t𝒯t\in\mathcal{T}, which takes the value of ξ¯t\bar{\xi}_{t} if it is observed through an inspection at time tt, i.e. zt=1z_{t}=1, and is zero otherwise.

Note that 𝒯\mathcal{T} denotes the set of time periods, not the set of stages. In fact, strictly speaking, the number of stages depends on the solution and is directly related to the number of inspections. The maximum possible number of stages is TT, which is only achieved if inspection is performed at every t𝒯t\in\mathcal{T}. In that case, after eliminating the state variables sts_{t} and rtr_{t}, the first-stage decisions are xx, y1y_{1}, p1p_{1}, and z1z_{1}. Then, ξ¯1\bar{\xi}_{1} is observed and the second-stage decisions are y2y_{2}, p2p_{2}, m2m_{2}, f2f_{2}, and z2z_{2}; this continues similarly up to stage TT. The number of stages is implicitly encoded in the recourse variables that are functions of only observed uncertain parameters. Specifically, for all t𝒯t\in\mathcal{T}, yt=yt(𝜻[t1])y_{t}=y_{t}(\bm{\zeta}_{[t-1]}), pt=pt(𝜻[t1])p_{t}=p_{t}(\bm{\zeta}_{[t-1]}), mt=mt(𝜻[t1])m_{t}=m_{t}(\bm{\zeta}_{[t-1]}), ft=ft(𝜻[t1])f_{t}=f_{t}(\bm{\zeta}_{[t-1]}), and zt=zt(𝜻[t1])z_{t}=z_{t}(\bm{\zeta}_{[t-1]}).

We conduct a case study with the following data: T=7T=7, β=$ 2×103\beta=\$\,2\times 10^{3}, γ=$ 400\gamma=\$\,400, δ=$ 103\delta=\$\,10^{3}, ρ=$ 2.5×105\rho=\$\,2.5\times 10^{5}, φ=$ 2×103\varphi=\$\,2\times 10^{3}, s0=10s_{0}=10, smax=50s^{\max}=50, C=90C=90, r0=40r_{0}=40, rmax=125r^{\max}=125, dt=50d_{t}=50 for all t𝒯t\in\mathcal{T}, ξmin=4\xi^{\min}=4, ξmax=25\xi^{\max}=25, and ξΔ=13\xi^{\Delta}=13. Two cases are considered: Case 1, in which the equipment upgrade cost α\alpha is $ 5×104\$\,5\times 10^{4}, and Case 2 with α\alpha being $ 104\$\,10^{4}. For each case, we solve one instance using the above ARO model with one breakpoint placed at 8 for each ξt\xi_{t}, and another instance that disallows recourse, resulting in a static problem (SRO). The optimal values and solution times for all four instances are shown in Table 2. In both Cases 1 and 2, one can observe a clear benefit from recourse, which is reflected in an improvement in the optimal value of about 3 %. However, the computational cost for solving the ARO model is significantly higher due to the larger model size. Specifically, the ARO model has 218 binary variables, 320,150 continuous variables, and 952,374 constraints after Gurobi’s preprocessing.

Table 2: Computational results for instances of the maintenance planning problem.
Cost of upgrade (k$) Model Optimal value (k$) Solution time (s)
Case 1 50 SRO 219 425
ARO 212 13,621
Case 2 10 SRO 213 382
ARO 206 7,837

The results in Table 2 also indicate that an equipment upgrade can be beneficial if the cost is sufficiently low as the solution suggests investing in an equipment upgrade in Case 2 but not in Case 1. Figure 6 further shows how the equipment upgrade affects the optimal schedule. As depicted in Figure 6a, without equipment upgrade, inspection is performed at time 1. Then, depending on the realization of ξ¯1\bar{\xi}_{1}, the top or bottom schedule is chosen. If ξ¯18\bar{\xi}_{1}\geq 8, maintenance is performed in time period 2, immediately after the inspection; otherwise, maintenance is postponed to time period 3. In contrast, the solution for Case 2, as shown in Figure 6b, suggests inspecting the equipment at time 2. This later time point is made possible by the equipment upgrade that reduces the maximum degradation rate. Then, if ξ¯216\bar{\xi}_{2}\geq 16, maintenance is performed in time period 3; otherwise, maintenance is scheduled for time period 6.

Refer to caption

(a) ARO solution for Case 1

Refer to caption

(b) ARO solution for Case 2
Figure 6: The Gantt charts show the schedules obtained from solving the ARO model for (a) Case 1 with high and (b) Case 2 with low cost of equipment upgrade.

7.3 Optimizing Revision Points in Capacity Planning

The previous example considers a problem in which the number of stages is affected by decisions due to the decision-dependent observation of uncertain parameters. There are other situations in which uncertain parameters are observed over time, independent of decisions, yet we still would like to restrict or optimize the number of stages. One example is long-term capacity planning, which involves large investments that require commitments and lead time for establishing contracts and infrastructure. Hence, it is often infeasible to change decisions frequently despite new information being constantly revealed over time. Instead, one has to restrict oneself to a few revision points at which the planning decisions are revised and updated. In their recent work, Basciftci et al. 58 have formally introduced an adaptive two-stage stochastic programming approach in which for each decision policy, one revision point is considered and chosen as part of the optimization problem. This problem can be viewed as a stochastic program with type-2b endogenous uncertainty where planning decisions for time periods before the revision point have to be made here and now while decisions after the revision point can be adjusted based on the uncertainty revealed up to the revision point. As a result, our framework can be directly applied to solve a robust optimization variant of the problem and extensions that consider multiple revision points.

Consider a simplified power generation capacity expansion problem with TT time periods and a set of types of generation resources \mathcal{I}. At the beginning of each time period tt (i.e. at time t1t-1), the uncertain electricity demand for the same time period, dtd_{t}, is revealed and the amount of electricity generated from each generation type ii in time period tt, yity_{it}, is determined. Capacity expansion decisions, however, can only be made or updated at revision points. Let xitx_{it} denote the amount of generation capacity of type ii that is added in time period tt and becomes available at time tt. Assume that for each generation type ii, there are two revision points, one at time ti1t^{\prime}_{i}-1, immediately after dtid_{t^{\prime}_{i}} is revealed, and the other one at time ti′′1t^{\prime\prime}_{i}-1 with ti<ti′′t^{\prime}_{i}<t^{\prime\prime}_{i}. Then, all xitx_{it} with t<tit<t^{\prime}_{i} have to be chosen before any demand for the planning horizon is known, all xitx_{it} with tit<ti′′t^{\prime}_{i}\leq t<t^{\prime\prime}_{i} are chosen at time ti1t^{\prime}_{i}-1 with known 𝒅[ti]\bm{d}_{[t^{\prime}_{i}]}, and all xitx_{it} with tti′′t\geq t^{\prime\prime}_{i} are chosen at time ti′′1t^{\prime\prime}_{i}-1 with known 𝒅[ti′′]\bm{d}_{[t^{\prime\prime}_{i}]}. Given a maximum number of revision points for each generation type ii, denoted by nin_{i}, we aim to place the revision points such that a given cost function, in this case a sample average approximation of the expected cost, is minimized.

We introduce binary variables z^it\hat{z}_{it}, which equals 1 if capacity expansion decisions for generation type ii are revised at time t1t-1, and zittz_{itt^{\prime}}, which equals 1 if dtd_{t^{\prime}} is known when the decision xitx_{it} is made. The problem can then be formulated as follows:

minimize𝒛,𝒛^,𝒙,𝒚\displaystyle\operatorname*{minimize}_{\bm{z},\bm{\hat{z}},\bm{x},\bm{y}}\quad iαixi0+s𝒮ϕst𝒯1(1+r)t[i(αix¯its+βiy¯its)]\displaystyle\sum_{i\in\mathcal{I}}\alpha_{i}x_{i0}+\sum_{s\in\mathcal{S}}\phi_{s}\sum_{t\in\mathcal{T}}\frac{1}{(1+r)^{t}}\left[\sum_{i\in\mathcal{I}}\left(\alpha_{i}\bar{x}_{its}+\beta_{i}\bar{y}_{its}\right)\right] (26a)
subjectto\displaystyle\mathrm{subject\;to}\quad\;\; t𝒯z^itnii\displaystyle\sum_{t\in\mathcal{T}}\hat{z}_{it}\leq n_{i}\quad\forall\,i\in\mathcal{I} (26b)
1nit′′=ttz^it′′zittt′′=ttz^it′′i,t𝒯,t=1,,t\displaystyle\frac{1}{n_{i}}\sum_{t^{\prime\prime}=t^{\prime}}^{t}\hat{z}_{it^{\prime\prime}}\leq z_{itt^{\prime}}\leq\sum_{t^{\prime\prime}=t^{\prime}}^{t}\hat{z}_{it^{\prime\prime}}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t (26c)
z^it{0,1}i,t𝒯\displaystyle\hat{z}_{it}\in\{0,1\}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T} (26d)
zitt{0,1}i,t𝒯,t=1,,t\displaystyle z_{itt^{\prime}}\in\{0,1\}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t (26e)
xitCiti,t𝒯\displaystyle x_{it}\leq C_{it}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T} (26f)
yitt=1txi,t1i,t𝒯,(𝒅[t],𝜻[t])Θt(𝒛[t])\displaystyle y_{it}\leq\sum\limits_{t^{\prime}=1}^{t}x_{i,t^{\prime}-1}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,(\bm{d}_{[t]},\bm{\zeta}_{[t]})\in\Theta_{t}(\bm{z}_{[t]}) (26g)
iyitdtt𝒯,(𝒅[t],𝜻[t])Θt(𝒛[t])\displaystyle\sum\limits_{i\in\mathcal{I}}y_{it}\geq d_{t}\quad\forall\,t\in\mathcal{T},\,(\bm{d}_{[t]},\bm{\zeta}_{[t]})\in\Theta_{t}(\bm{z}_{[t]}) (26h)
xi0+i\displaystyle x_{i0}\in\mathbb{R}_{+}\quad\forall\,i\in\mathcal{I} (26i)
yit,xit+i,t𝒯,(𝒅[t],𝜻[t])Θt(𝒛[t]),\displaystyle y_{it},\,x_{it}\in\mathbb{R}_{+}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,(\bm{d}_{[t]},\bm{\zeta}_{[t]})\in\Theta_{t}(\bm{z}_{[t]}), (26j)

where 𝒯={1,,T}\mathcal{T}=\{1,\dots,T\}, 𝒮\mathcal{S} denotes the set of scenarios, αi\alpha_{i} and βi\beta_{i} are cost coefficients, and rr is a discount factor. Each scenario s𝒮s\in\mathcal{S} is characterized by a specific electricity demand profile 𝒅¯[T],s\bm{\bar{d}}_{[T],s} and probability ϕs\phi_{s}. In the objective function (26a), x¯its=xit(𝒛it𝒅¯[t],s)\bar{x}_{its}=x_{it}(\bm{z}_{it}\circ\bm{\bar{d}}_{[t],s}) and y¯its=yit(𝒅¯[t],s)\bar{y}_{its}=y_{it}(\bm{\bar{d}}_{[t],s}) represent the recourse decisions for specific scenarios. Constraints (26b) restrict the number of revision points. Inequalities (26c) force zittz_{itt^{\prime}} to be 1 if there is at least one revision point between times t1t^{\prime}-1 and t1t-1 and 0 otherwise. The rationale is that if there is one or multiple revision points between t1t^{\prime}-1 and t1t-1, then xitx_{it} will be determined at the latest revision point within that time frame at which point dtd_{t^{\prime}} will be known. Constraints (26f) set an upper bound, CitC_{it}, on each xitx_{it}. Capacity constraints are given in (26g), and constraints (26h) state that the total amount of electricity produced has to meet or exceed the demand. The decision-dependent uncertainty set is

Θt(𝒛[t])={(𝒅[t],𝜻[t])+Kt:d1mind1d1maxδtmindt1dtδtmaxdt1t=2,,tt′′=1tdt′′t′′=1tδ¯t′′1d1t=1,,tdt′′dt′′max(1zitt′′)ζitt′′dt′′i,t=1,,t,t′′=1,,tdt′′minzitt′′ζitt′′dt′′maxzitt′′i,t=1,,t,t′′=1,,t},\small\Theta_{t}(\bm{z}_{[t]})=\left\{(\bm{d}_{[t]},\bm{\zeta}_{[t]})\in\mathbb{R}_{+}^{K_{t}}:\begin{array}[]{l}d_{1}^{\min}\leq d_{1}\leq d_{1}^{\max}\\[4.0pt] \delta_{t^{\prime}}^{\min}d_{t^{\prime}-1}\leq d_{t^{\prime}}\leq\delta_{t^{\prime}}^{\max}d_{t^{\prime}-1}\quad\forall\,t^{\prime}=2,\dots,t\\ \sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}d_{t^{\prime\prime}}\leq\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\bar{\delta}^{t^{\prime\prime}-1}d_{1}\quad\forall\,t^{\prime}=1,\dots,t\\[8.0pt] d_{t^{\prime\prime}}-d^{\max}_{t^{\prime\prime}}(1-z_{it^{\prime}t^{\prime\prime}})\leq\zeta_{it^{\prime}t^{\prime\prime}}\leq d_{t^{\prime\prime}}\quad\forall\,i\in\mathcal{I},\,t^{\prime}=1,\dots,t,\,t^{\prime\prime}=1,\dots,t^{\prime}\\[4.0pt] d^{\min}_{t^{\prime\prime}}z_{it^{\prime}t^{\prime\prime}}\leq\zeta_{it^{\prime}t^{\prime\prime}}\leq d^{\max}_{t^{\prime\prime}}z_{it^{\prime}t^{\prime\prime}}\quad\forall\,i\in\mathcal{I},\,t^{\prime}=1,\dots,t,\,t^{\prime\prime}=1,\dots,t^{\prime}\end{array}\right\},

which encodes the correlation between the electricity demands in different time periods as the range in which dtd_{t^{\prime}} can vary depends on dt1d_{t^{\prime}-1} and the budget of uncertainty imposed on t′′=1tdt′′\sum_{t^{\prime\prime}=1}^{t^{\prime}}d_{t^{\prime\prime}} is t′′=1tδ¯t′′1d1\sum_{t^{\prime\prime}=1}^{t^{\prime}}\bar{\delta}^{t^{\prime\prime}-1}d_{1}, which is a function of d1d_{1} with a given parameter δ¯\bar{\delta}. The auxiliary uncertain parameters 𝜻[t]\bm{\zeta}_{[t]} depend on the choice of revision points and hence on 𝒛[t]\bm{z}_{[t]}. Note that the vector 𝒛[t]\bm{z}_{[t]} contains all zitt′′z_{it^{\prime}t^{\prime\prime}} with ii\in\mathcal{I}, t=1,,tt^{\prime}=1,\dots,t, and t′′=1,,tt^{\prime\prime}=1,\dots,t^{\prime}. Also, Kt=t+||t(t+1)/2K_{t}=t+|\mathcal{I}|t(t+1)/2. The recourse variables are adjustable given the following dependence on uncertain parameters: xit=xit(𝜻[t])x_{it}=x_{it}(\bm{\zeta}_{[t]}) and yit=yit(𝒅[t])y_{it}=y_{it}(\bm{d}_{[t]}).

Notice that 𝒛\bm{z} and 𝒛^\bm{\hat{z}}, which determine the revision points, are all first-stage variables. As a result, the decision-dependent nonanticipativity for the capacity expansion decisions 𝒙\bm{x} can alternatively be modeled by forcing the decision rule coefficients associated with unobserved uncertain parameters to zero (see Section 6.1). The decision rules with respect to the lifted uncertain parameters are

xit=t=0t(𝒙¯itt𝒅¯t+𝒙^itt𝒅^t)i,t𝒯x_{it}=\sum_{t^{\prime}=0}^{t}\left(\bm{\bar{x}}_{itt^{\prime}}^{\top}\bm{\bar{d}}_{t^{\prime}}+\bm{\hat{x}}_{itt^{\prime}}^{\top}\bm{\hat{d}}_{t^{\prime}}\right)\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T}

with d¯0=1\bar{d}_{0}=1 and d^0=0\hat{d}_{0}=0. Decision-dependent nonanticipativity is then enforced by the following constraints:

Mzitt𝒆𝒙¯ittMzitt𝒆i,t𝒯,t=1,,t\displaystyle-Mz_{itt^{\prime}}\bm{e}\leq\bm{\bar{x}}_{itt^{\prime}}\leq Mz_{itt^{\prime}}\bm{e}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t
Mzitt𝒆𝒙^ittMzitt𝒆i,t𝒯,t=1,,t\displaystyle-Mz_{itt^{\prime}}\bm{e}\leq\bm{\hat{x}}_{itt^{\prime}}\leq Mz_{itt^{\prime}}\bm{e}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t

with MM being a sufficiently large parameter. This alternative approach allows us to work with a fixed uncertainty set:

Θt={𝒅[t]+t:d1mind1d1maxδtmindt1dtδtmaxdt1t=2,,tt′′=1tdt′′t′′=1tδ¯t′′1d1t=1,,t}.\Theta_{t}=\left\{\bm{d}_{[t]}\in\mathbb{R}_{+}^{t}:\begin{array}[]{l}d_{1}^{\min}\leq d_{1}\leq d_{1}^{\max}\\[4.0pt] \delta_{t^{\prime}}^{\min}d_{t^{\prime}-1}\leq d_{t^{\prime}}\leq\delta_{t^{\prime}}^{\max}d_{t^{\prime}-1}\quad\forall\,t^{\prime}=2,\dots,t\\ \sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}d_{t^{\prime\prime}}\leq\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\bar{\delta}^{t^{\prime\prime}-1}d_{1}\quad\forall\,t^{\prime}=1,\dots,t\end{array}\right\}.

Moreover, it does not require auxiliary uncertain parameters, which significantly reduces the computational complexity.

In the following case study, we consider a planning horizon of seven time periods and four generation types, namely coal, natural gas-gas turbine (NG-GT), solar PV, and wind. Cost data are adapted from Min et al. 59, which, along with all other required data, can be found in the supplementary material. Furthermore, the objective function is constructed using 100 scenarios sampled from a uniform distribution supported on the given uncertainty set, and uncertainty lifting is performed with two equidistantly placed breakpoints for each uncertain parameter.

We solve the capacity planning problem for different maximum allowed number of revision points n¯\bar{n}, which is applied to all generation types, i.e. ni=n¯n_{i}=\bar{n} for all ii\in\mathcal{I}. The results are shown in Table 3. As expected, the optimal value decreases with increasing number of revision points. The improvement is most significant from the static case (no revision) to the case with one revision point. Diminishing returns are observed at higher number of revision points. The results show that the locations of the revision points for different generation types are generally not the same. For larger n¯\bar{n}, there also seems to be no need to use the maximum number of revision points for all generation types. Finally, note that the solution times are moderate, indicating the potential to solve more complex problems of this kind using the proposed approach.

Table 3: Computational results for the capacity planning problem with different allowed numbers of revision points. The relative improvement is computed with respect to the optimal value of the case with no revision.
Allowed # of revision points Optimal value (k$) Solution time (s) Relative improvement Location of revision points
Coal NG-GT Solar PV Wind
0 46,183 9
1 40,852 427 11.54 % 3 5 1 2
2 39,526 1,679 14.42 % 3, 6 2, 5 1, 6 1, 3
3 38,955 764 15.65 % 2, 3, 6 1, 2, 5 1, 5, 6 1, 3, 5
4 38,612 874 16.39 % 2, 3, 5, 6 1, 2, 3, 5 1, 5, 6 1, 3, 5
5 38,423 592 16.80 % 1, 2, 3, 5, 6 1, 2, 3, 4, 5 5, 6 1, 2, 3, 5
6 38,423 462 16.80 % 1, 2, 3, 4, 5, 6 1, 2, 3, 4, 5 5, 6 1, 2, 3, 5

7.4 Production Scheduling with Active Parameter Estimation

When operating a manufacturing process, we often do not have accurate information across the whole range of feasible operation due to the lack of process data. A model that we use for process optimization may only be accurate around one or a few nominal points, especially if it is a relatively new process. For example, a production unit may be able to operate in different operating modes, but process parameters such as efficiency and processing time are only accurately known for the one operating mode that is used most of the time. The models of other, potentially better-performing operating modes can only be improved by actually operating in those modes and collecting more data; however, this exploration action comes with the risk of operating under conditions with inferior performance or damaging impact. In the following example, we consider process scheduling with active model parameter estimation, which integrates scheduling optimization and design of experiments in order to find an optimal trade-off between exploration and exploitation that maximizes the overall plant performance over time. The problem is formulated as follows:

minimize𝒙,𝒔\displaystyle\operatorname*{minimize}_{\bm{x},\bm{s}} max𝒑𝒫T(𝒙)t𝒯ij𝒥i𝒄𝒖ijxtij(𝒑[t])+δst(𝒑[t])\displaystyle\max_{\bm{p}\in\mathcal{P}_{T}(\bm{x})}\sum_{t\in\mathcal{T}}\sum_{i\in\mathcal{I}}\sum_{j\in\mathcal{J}_{i}}\bm{c}^{\top}\bm{u}_{ij}\,x_{tij}(\bm{p}_{[t]})+\delta s_{t}(\bm{p}_{[t]}) (27)
subjectto\displaystyle\mathrm{subject\;to} st(𝒑[t])=s0+t=1t(ij𝒥iptijdt)smaxt𝒯,𝒑[t]𝒫t(𝒙[t])\displaystyle s_{t}(\bm{p}_{[t]})=s_{0}+\sum_{t^{\prime}=1}^{t}\left(\sum_{i\in\mathcal{I}}\sum_{j\in\mathcal{J}_{i}}p_{t^{\prime}ij}-d_{t^{\prime}}\right)\leq s^{\max}\quad\forall\,t\in\mathcal{T},\,\bm{p}_{[t]}\in\mathcal{P}_{t}(\bm{x}_{[t]})
ij𝒥ixtij(𝒑[t])1t𝒯,𝒑[t]𝒫t(𝒙[t])\displaystyle\sum_{i\in\mathcal{I}}\sum_{j\in\mathcal{J}_{i}}x_{tij}(\bm{p}_{[t]})\leq 1\quad\forall\,t\in\mathcal{T},\,\bm{p}_{[t]}\in\mathcal{P}_{t}(\bm{x}_{[t]})
st(𝒑[t])+t𝒯,𝒑[t]𝒫t(𝒙[t])\displaystyle s_{t}(\bm{p}_{[t]})\in\mathbb{R}_{+}\quad\forall\,t\in\mathcal{T},\,\bm{p}_{[t]}\in\mathcal{P}_{t}(\bm{x}_{[t]})
xtij(𝒑[t]){0,1}t𝒯,i,j𝒥i,𝒑[t]𝒫t(𝒙[t]),\displaystyle x_{tij}(\bm{p}_{[t]})\in\{0,1\}\quad\forall\,t\in\mathcal{T},\,i\in\mathcal{I},\,j\in\mathcal{J}_{i},\,\bm{p}_{[t]}\in\mathcal{P}_{t}(\bm{x}_{[t]}),

where we assume that the plant operating point is chosen among a discrete set of operating points, 𝒥i\mathcal{J}_{i}, which depends on operating mode ii. Each operating point jj of mode ii is defined by a vector of inputs, 𝒖ij\bm{u}_{ij}. The binary variable xtijx_{tij} equals 1 if the plant operates at point jj of mode ii in time period tt. The amount produced, ptijp_{tij}, is treated as an uncertain parameter since the model parameters for the corresponding mode ii may be uncertain. Given cost coefficients 𝒄\bm{c} and δ\delta, the objective is to minimize the worst-case cost over the entire scheduling horizon 𝒯={1,,T}\mathcal{T}=\{1,\dots,T\}. With an initial inventory s0s_{0}, the inventory level at time tt, sts_{t}, is bounded between zero and smaxs^{\max}. The product demand in time period tt is denoted by dtd_{t}.

We assume that the production amount in each operating mode ii is given by a linear function p=ai+𝒃i𝒖p=a_{i}+\bm{b}_{i}^{\top}\bm{u}. For some modes, the model parameters aia_{i} and 𝒃i\bm{b}_{i} are unknown at the beginning of the scheduling horizon, and they can only be estimated by operating in those modes and observing the resulting production amounts. This is captured in the following decision-dependent uncertainty set:

𝒫t(𝒙[t])={𝒑[t]+Kt:ai,𝒃iisuch that0ptijpimaxxtijt=1,,t,i,j𝒥iai+𝒃i𝒖ijpimax(1xtij)ptijai+𝒃i𝒖ijt=1,,t,i,j𝒥iaiminaiaimaxi𝒃imin𝒃i𝒃imaxi}\small\mathcal{P}_{t}(\bm{x}_{[t]})=\left\{\bm{p}_{[t]}\in\mathbb{R}_{+}^{K_{t}}:\begin{array}[]{l}\exists\,a_{i},\,\bm{b}_{i}\;\forall\,i\in\mathcal{I}\quad\text{such that}\\[4.0pt] 0\leq p_{t^{\prime}ij}\leq p^{\max}_{i}x_{t^{\prime}ij}\quad\forall\,t^{\prime}=1,\dots,t,\,i\in\mathcal{I},\,j\in\mathcal{J}_{i}\\[4.0pt] a_{i}+\bm{b}_{i}^{\top}\bm{u}_{ij}-p_{i}^{\max}(1-x_{t^{\prime}ij})\leq p_{t^{\prime}ij}\leq a_{i}+\bm{b}_{i}^{\top}\bm{u}_{ij}\quad\forall\,t^{\prime}=1,\dots,t,\,i\in\mathcal{I},\,j\in\mathcal{J}_{i}\\[4.0pt] a^{\min}_{i}\leq a_{i}\leq a^{\max}_{i}\quad\forall\,i\in\mathcal{I}\\[4.0pt] \bm{b}^{\min}_{i}\leq\bm{b}_{i}\leq\bm{b}^{\max}_{i}\quad\forall\,i\in\mathcal{I}\end{array}\right\}

with Kt=ti|𝒥i|K_{t}=t\sum_{i\in\mathcal{I}}|\mathcal{J}_{i}|. Only if xtij=1x_{tij}=1, the production amount ptijp_{tij} can only take a nonzero value, which depends on the chosen operating point. Each operating point that has been selected up to time period tt activates a linear equation, which can be interpreted as a cut that reduces the size of the uncertainty set. As a result, the linear model of an operating mode is learned through the accumulation of observations, which are the production amounts at different operating points. If 𝒖ijn\bm{u}_{ij}\in\mathbb{R}^{n}, and all model parameters of mode ii are unknown, one needs to operate in mode ii at different operating points at least n+1n+1 times in order to uniquely determine aia_{i} and 𝒃i\bm{b}_{i}.

We use a small instance to illustrate the effect of active learning in this problem. The data are as follows: T=4T=4, c=5c=5, δ=2\delta=2, 𝒅=(32.5,15,51,42.5)\bm{d}=(32.5,15,51,42.5), smax=100s^{\max}=100 and s0=0s_{0}=0. We consider two operating modes and two possible operating points per mode, with each operating point j𝒥ij\in\mathcal{J}_{i} defined by a scalar input uiju_{ij}. We have u11=10u_{11}=10, u12=46u_{12}=46, u21=25u_{21}=25, and u22=44u_{22}=44. The values of the parameters describing the uncertainty set are shown in Table 4. Here, we assume that the plant has not been operated in any of the two modes prior to the beginning of the scheduling horizon.

Table 4: Parameters for the description of the uncertainty set in the production scheduling with active parameter estimation problem.
pimaxp^{\max}_{i} aimina^{\min}_{i} aimaxa^{\max}_{i} biminb^{\min}_{i} bimaxb^{\max}_{i}
Mode 1 65.2 5 10 1 1.2
Mode 2 56.2 20 21 0.5 0.8

The optimal adjustable production schedule is shown in Figure 7. The solution suggests to first operate at point 1 of mode 2 and choose the subsequent schedule depending on the realization of p121p_{121}. The schedule at the top is selected if p12136.75p_{121}\geq 36.75, and the bottom schedule is selected if p121<36.75p_{121}<36.75. One can see that in both schedules, the operating points selected in time periods 2 and 3 are the same; the schedules only differ in the operating mode chosen for time period 4. By only observing the production amount at operating point 1 of mode 2, we cannot fully infer the model for mode 2, i.e. determine a2a_{2} and b2b_{2}. However, the observation reduces the uncertainty for mode 2, which allows us to determine in which mode the plant should operate in time period 4.

Refer to caption
Figure 7: Optimal adjustable production schedule with active parameter estimation. The solution indicates that recourse is only required once.

The effect of learning is further illustrated in Figure 8, which shows how the uncertainty set is reduced over time in a particular scenario with p121>36.75p_{121}>36.75, in which case the schedule at the top of Figure 7 is applied. Each of the four subfigures shows the production amount as a function of the input for a specific time period and the mode chosen for that time period. The black solid vertical lines indicate the marginal uncertainty sets of the production amounts at the available operating points before any observations are made, as given by the ranges of possible values for aia_{i} and bib_{i}. In this particular scenario, a production amount of 39 is observed in time period 1 when the plant operates at point 1 of mode 2. Given this observation and a2[20,21]a_{2}\in[20,21], the uncertainty with respect to b2b_{2} is immediately reduced such that the line representing the linear model of mode 2 has to lie within the cyan shaded area shown in Figure 8a. One can see that this results in a substantial reduction of uncertainty with respect to the unobserved operating point 2 of mode 2. The same effect is seen in time period 2 when the plant operates at point 1 of mode 1 (Figure 8b). After operating at point 2 of mode 1 in time period 3, a1a_{1} and b1b_{1} can be uniquely determined given the observations p211p_{211} and p312p_{312}. The linear model of mode 1 is represented by the red solid line in Figure 8c. Finally, the linear model of mode 2 will also be exactly known after operating at point 2 of mode 2 in time period 4, as indicated in Figure 8d.

Refer to caption

(a) Period 1 - mode 2

Refer to caption

(b) Period 2 - mode 1

Refer to caption

(c) Period 3 - mode 1

Refer to caption

(d) Period 4 - mode 2
Figure 8: Reduction of uncertainty by observing materialized production amounts over time for one specific scenario (defined by one specific set of observations).

The optimal value of the problem is 819.2. Despite the simplicity of the illustrative example, the problem is computationally challenging. This is in part due to the large number of variables and constraints resulting from the reformulation, but even more due to the weak relaxation of the MILP. The latter is indicated by a lower bound that only improves very slowly in the branch-and-bound algorithm, despite the incumbent being the optimal solution already. Computational strategies for mitigating these challenges will be explored and discussed in future work.

8 Conclusions

In this work, we have developed an adjustable robust optimization framework that can simultaneously consider three types of endogenous uncertainty: type 1, type 2a, and type 2b, as defined in a new refined classification of endogenous uncertainty that distinguishes between the decision-dependent materialization and observation of uncertain parameters. One compelling insight from this extended view is that active learning, a sequential form of experimental design, can be formulated as an optimization problem under endogenous uncertainty. As a result, the proposed approach provides a means of performing integrated optimization and active learning in a sequential decision-making environment.

Our framework considers decision-dependent polyhedral uncertainty sets, and we have applied a decision rule approach based on the concept of lifted uncertainty that incorporates continuous and binary recourse, including recourse decisions that affect the uncertainty set. Multistage decision-dependent nonanticipativity is modeled using auxiliary uncertain parameters, which preserves the tractability of the decision rule approach. This eventually results in a mixed-integer optimization problem that can be directly solved using off-the-shelf solvers.

The proposed methodology significantly expands our capability to model and solve data-driven optimization problems under uncertainty, particularly those involving active learning. Computational challenges still exist, which we plan to address in our future work. Nonetheless, this approach promises to be especially suited for applications that involve complex constraints, expensive exploration actions, and strong links between different decision stages. To highlight the relevance of these kinds of problems, we have conducted computational case studies covering a variety of applications. The results demonstrate the versatility of the proposed modeling framework as well as the importance of endogenous uncertainty. As these aspects become increasingly relevant in PSE applications, we hope that this work can contribute toward the development of more comprehensive and efficient methods for such problems.

Acknowledgments

Wei Feng gratefully acknowledges the financial support from the China Scholarship Council (CSC) (No. 201906320317).

References

  • Ben-Tal and Nemirovski 1998 A. Ben-Tal and A. Nemirovski. Robust Convex Optimization. Mathematics of Operations Research, 23(4):769–805, 1998.
  • El Ghaoui et al. 1998 L. El Ghaoui, F. Oustry, and H. Lebret. Robust Solutions to Uncertain Semidefinite Programs. SIAM Journal on Optimization, 9(1):33–52, 1998.
  • Ben-Tal et al. 2004 A. Ben-Tal, A. Goryashko, E. Guslitzer, and A. Nemirovski. Adjustable robust solutions of uncertain linear programs. Mathematical Programming, 99(2):351–376, 2004.
  • Georghiou et al. 2019 A. Georghiou, D. Kuhn, and W. Wiesemann. The decision rule approach to optimization under uncertainty: methodology and applications. Computational Management Science, 16(4):545–576, 2019.
  • Bertsimas et al. 2013 D. Bertsimas, E. Litvinov, X. A. Sun, J. Zhao, and T. Zheng. Adaptive robust optimization for the security constrained unit commitment problem. IEEE Transactions on Power Systems, 28(1):52–63, 2013.
  • Zeng and Zhao 2013 B. Zeng and L. Zhao. Solving two-stage robust optimization problems using a column-and-constraint generation method. Operations Research Letters, 41(5):457–461, 2013.
  • Bertsimas and Caramanis 2010 D. Bertsimas and C. Caramanis. Finite adaptability in multistage linear optimization. IEEE Transactions on Automatic Control, 55(12):2751–2766, 2010.
  • Hanasusanto et al. 2015 G. A. Hanasusanto, D. Kuhn, and W. Wiesemann. K-Adaptability in Two-Stage Robust Binary Programming. Operations Research, 63(4):877–891, 2015.
  • Bertsimas and Georghiou 2015 D. Bertsimas and A. Georghiou. Design of Near Optimal Decision Rules in Multistage Adaptive Mixed-Integer Optimization. Operations Research, 63(3):610–627, 2015.
  • Postek and den Hertog 2016 K. Postek and D. den Hertog. Multistage Adjustable Robust Mixed-Integer Optimization via Iterative Splitting of the Uncertainty Set. INFORMS Journal on Computing, 28(3):553–574, 2016.
  • Bertsimas and Georghiou 2018 D. Bertsimas and A. Georghiou. Binary decision rules for multistage adaptive mixed-integer optimization. Mathematical Programming, 167(2):395–433, 2018.
  • Subramanyam et al. 2020 A. Subramanyam, C. E. Gounaris, and W. Wiesemann. K-adaptability in two-stage mixed-integer robust optimization. Mathematical Programming Computation, 12:193–224, 2020.
  • Yanıkoğlu et al. 2019 I. Yanıkoğlu, B. L. Gorissen, and D. den Hertog. A survey of adjustable robust optimization. European Journal of Operational Research, 277(3):799–813, 2019.
  • Lappas and Gounaris 2016 N. H. Lappas and C. E. Gounaris. Multi-Stage Adjustable Robust Optimization for Process Scheduling Under Uncertainty. AIChE Journal, 62(5):1646–1667, 2016.
  • Zhang et al. 2016a Q. Zhang, M. F. Morari, I. E. Grossmann, A. Sundaramoorthy, and J. M. Pinto. An adjustable robust optimization approach to scheduling of continuous industrial processes providing interruptible load. Computers and Chemical Engineering, 86:106–119, 2016a.
  • Shi and You 2016 H. Shi and F. You. A Computational Framework and Solution Algorithms for Two-Stage Adaptive Robust Scheduling of Batch Manufacturing Processes Under Uncertainty. AIChE Journal, 62(3):687–703, 2016.
  • Zhang et al. 2016b Q. Zhang, I. E. Grossmann, and R. M. Lima. On the Relation Between Flexibility Analysis and Robust Optimization for Linear Systems. AIChE Journal, 62(9):3109–2132, 2016b.
  • Grossmann et al. 2014 I. E. Grossmann, B. A. Calfa, and P. Garcia-Herreros. Evolution of concepts and models for quantifying resiliency and flexibility of chemical processes. Computers and Chemical Engineering, 70:22–34, 2014.
  • Yuan et al. 2018 Y. Yuan, Z. Li, and B. Huang. Nonlinear robust optimization for process design. AIChE Journal, 64(2):481–494, 2018.
  • Gong and You 2018 J. Gong and F. You. Resilient design and operations of process systems: Nonlinear adaptive robust optimization model and algorithm for resilience analysis and enhancement. Computers and Chemical Engineering, 116:231–252, 2018.
  • Ning and You 2017 C. Ning and F. You. A Data-Driven Multistage Adaptive Robust Optimization Framework for Planning and Scheduling Under Uncertainty. AIChE Journal, 63(10):4343–4369, 2017.
  • Lappas and Gounaris 2018a N. H. Lappas and C. E. Gounaris. Theoretical and computational comparison of continuous-time process scheduling models for adjustable robust optimization. AIChE Journal, 64(8):3055–3070, 2018a.
  • Feng et al. 2019 W. Feng, Y. Zhang, G. Rong, and Y. Feng. Finite Adaptability in Data-Driven Robust Optimization for Production Scheduling: A Case Study of the Ethylene Plant. Industrial & Engineering Chemistry Research, 58(16):6505–6518, 2019.
  • Rahal et al. 2020 S. Rahal, Z. Li, and D. J. Papageorgiou. Proactive and reactive scheduling of the steelmaking and continuous casting process through adaptive robust optimization. Computers and Chemical Engineering, 133:106658, 2020.
  • Zhang et al. 2017 X. Zhang, M. Kamgarpour, A. Georghiou, P. Goulart, and J. Lygeros. Robust optimal control with adjustable uncertainty sets. Automatica, 75:249–259, 2017.
  • Shang et al. 2019 C. Shang, W. H. Chen, and F. You. Robust constrained model predictive control of irrigation systems based on data-driven uncertainty set constructions. In Proceedings of the American Control Conference, pages 1906–1911. American Automatic Control Council, 2019.
  • Ning and You 2020 C. Ning and F. You. A transformation-proximal bundle algorithm for multistage adaptive robust optimization and application to constrained robust optimal control. Automatica, 113:108802, 2020.
  • Yue and You 2016 D. Yue and F. You. Optimal Supply Chain Design and Operations Under Multi-Scale Uncertainties: Nested Stochastic Robust Optimization Modeling Framework and Solution Algorithm. AIChE Journal, 62(9):3041–3055, 2016.
  • Subramanyam et al. 2018 A. Subramanyam, F. Mufalli, J. M. Pinto, and C. E. Gounaris. Robust Multi-Period Vehicle Routing under Customer Order Uncertainty. Optimization Online, 2018.
  • Jonsbråten et al. 1998 T. W. Jonsbråten, R. J.-B. Wets, and D. L. Woodruff. A class of stochastic programs with decision dependent random elements. Annals of Operations Research, 82:83–106, 1998.
  • Peeta et al. 2010 S. Peeta, F. S. Salman, D. Gunnec, and K. Viswanath. Pre-disaster investment decisions for strengthening a highway network. Computers and Operations Research, 37(10):1708–1719, 2010.
  • Escudero et al. 2018 L. F. Escudero, M. A. Garín, J. F. Monge, and A. Unzueta. On preparedness resource allocation planning for natural disaster relief under endogenous uncertainty with time-consistent risk-averse management. Computers and Operations Research, 98:84–102, 2018.
  • Hellemo et al. 2018 L. Hellemo, P. I. Barton, and A. Tomasgard. Decision-dependent probabilities in stochastic programs with recourse. Computational Management Science, 15(3-4):369–395, 2018.
  • Goel and Grossmann 2004 V. Goel and I. E. Grossmann. A stochastic programming approach to planning of offshore gas field developments under uncertainty in reserves. Computers and Chemical Engineering, 28(8):1409–1429, 2004.
  • Goel and Grossmann 2006 V. Goel and I. E. Grossmann. A class of stochastic programs with decision dependent uncertainty. Mathematical Programming, 108:355–397, 2006.
  • Colvin and Maravelias 2008 M. Colvin and C. T. Maravelias. A stochastic programming approach for clinical trial planning in new drug development. Computers and Chemical Engineering, 32(11):2626–2642, 2008.
  • Colvin and Maravelias 2010 M. Colvin and C. T. Maravelias. Modeling methods and a branch and cut algorithm for pharmaceutical clinical trial planning using stochastic programming. European Journal of Operational Research, 203(1):205–215, 2010.
  • Vayanos et al. 2011 P. Vayanos, D. Kuhn, and B. Rustem. Decision rules for information discovery in multi-stage stochastic programming. In Proceedings of the IEEE Conference on Decision and Control, pages 7368–7373, 2011.
  • Tarhan et al. 2013 B. Tarhan, I. E. Grossmann, and V. Goel. Computational strategies for non-convex multistage MINLP models with decision-dependent uncertainty and gradual uncertainty resolution. Annals of Operations Research, 203(1):141–166, 2013.
  • Gupta and Grossmann 2014 V. Gupta and I. E. Grossmann. A new decomposition algorithm for multistage stochastic programs with endogenous uncertainties. Computers and Chemical Engineering, 62:62–79, 2014.
  • Christian and Cremaschi 2015 B. Christian and S. Cremaschi. Heuristic solution approaches to the pharmaceutical R&D pipeline management problem. Computers and Chemical Engineering, 74:34–47, 2015.
  • Hooshmand and MirHassani 2016 F. Hooshmand and S. A. MirHassani. Efficient constraint reduction in multistage stochastic programming problems with endogenous uncertainty. Optimization Methods and Software, 31(2):359–376, 2016.
  • Apap and Grossmann 2017 R. M. Apap and I. E. Grossmann. Models and computational strategies for multistage stochastic programming under endogenous and exogenous uncertainties. Computers and Chemical Engineering, 103:233–274, 2017.
  • Poss 2013 M. Poss. Robust combinatorial optimization with variable budgeted uncertainty. 4OR, 11(1):75–92, 2013.
  • Nohadani and Sharma 2018 O. Nohadani and K. Sharma. Optimization under decision-dependent uncertainty. SIAM Journal on Optimization, 28(2):1773–1795, 2018.
  • Lappas and Gounaris 2018b N. H. Lappas and C. E. Gounaris. Robust optimization for decision-making under endogenous uncertainty. Computers and Chemical Engineering, 111:252–266, 2018b.
  • Avraamidou and Pistikopoulos 2019 S. Avraamidou and E. N. Pistikopoulos. Adjustable robust optimization through multi-parametric programming. Optimization Letters, 14(4):873–887, 2019.
  • Feng et al. 2020 W. Feng, Y. Feng, and Q. Zhang. Multistage Robust Mixed-Integer Optimization Under Endogenous Uncertainty. arXiv preprint arXiv:2008.11633, 2020.
  • Bertsimas and Vayanos 2014 D. Bertsimas and P. Vayanos. Data-driven learning in dynamic pricing using adaptive optimization. Optimization Online, 2014.
  • Vayanos et al. 2020a P. Vayanos, A. Georghiou, and H. Yu. Robust optimization with decision-dependent information discovery. arXiv preprint arXiv:2004.08490, 2020a.
  • Vayanos et al. 2020b P. Vayanos, D. C. Mcelfresh, Y. Ye, J. P. Dickerson, and E. Rice. Active Preference Elicitation via Adjustable Robust Optimization. arXiv preprint arXiv:2003.01899, 2020b.
  • Shahriari et al. 2016 B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. De Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2016.
  • Sutton and Barto 2018 R. S. Sutton and A. G. Barto. Reinforcement learning: An introduction. MIT press, 2018.
  • Audibert et al. 2010 J. Y. Audibert, S. Bubeck, and R. Munos. Best arm identification in multi-armed bandits. COLT 2010 - The 23rd Conference on Learning Theory, pages 41–53, 2010.
  • Georghiou et al. 2015 A. Georghiou, W. Wiesemann, and D. Kuhn. Generalized decision rule approximations for stochastic programming via liftings. Mathematical Programming, 152(1-2):301–338, 2015.
  • Gorissen et al. 2015 B. L. Gorissen, I. Yanikoğlu, and D. den Hertog. A practical guide to robust optimization. Omega, 53:124–137, 2015.
  • Lubin and Dunning 2015 M. Lubin and I. Dunning. Computing in Operations Research Using Julia. INFORMS Journal on Computing, 27(2):237–248, 2015.
  • Basciftci et al. 2019 B. Basciftci, S. Ahmed, and N. Gebraeel. Adaptive Two-stage Stochastic Programming with an Application to Capacity Expansion Planning. arXiv preprint arXiv:1906.03513, 2019.
  • Min et al. 2018 D. Min, J.-h. Ryu, and D. G. Choi. A long-term capacity expansion planning model for an electric power system integrating large-size renewable energy technologies. Computers and Operations Research, 96:244–255, 2018.