A Unified Framework for Adjustable Robust Optimization with Endogenous Uncertainty

Qi Zhang Corresponding author ([email protected]) Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, MN 55455, USA Wei Feng State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China

Abstract

This work proposes a framework for multistage adjustable robust optimization that unifies the treatment of three different types of endogenous uncertainty, where decisions, respectively, (i) alter the uncertainty set, (ii) affect the materialization of uncertain parameters, and (iii) determine the time when the true values of uncertain parameters are observed. We provide a systematic analysis of the different types of endogenous uncertainty and highlight the connection between optimization under endogenous uncertainty and active learning. We consider decision-dependent polyhedral uncertainty sets and propose a decision rule approach that incorporates both continuous and binary recourse, including recourse decisions that affect the uncertainty set. The proposed method enables the modeling of decision-dependent nonanticipativity and results in a tractable reformulation of the problem. We demonstrate the effectiveness of the approach in computational experiments that cover a range of applications, including plant redesign, maintenance planning with inspections, optimizing revision points in capacity planning, and production scheduling with active parameter estimation. The results show significant benefits from the proper modeling of endogenous uncertainty and active learning.

Keywords: adjustable robust optimization, endogenous uncertainty, active learning

1 Introduction

Robust optimization is an approach to decision making under uncertainty that has gained tremendous popularity in the optimization community over the last two decades. It is particularly attractive in situations in which (i) the set of possible uncertainty realizations, commonly referred to as the uncertainty set, is known but the probability distribution is not, (ii) feasibility over the entire uncertainty set or the worst-case performance is of main interest, or (iii) alternative approaches, such as scenario-based stochastic programming, are computationally significantly less tractable. Earlier works on robust optimization ¹, ² were limited to the static case, which only considers here-and-now decisions that need to be determined before the uncertainty is realized. In contrast, adjustable robust optimization (ARO) ³ provides a framework that also incorporates wait-and-see (or recourse) decisions, which can be determined after the true values of some uncertain parameters are revealed. This makes ARO suitable for a large set of sequential decision-making problems, hence greatly expanding the applicability of the robust optimization paradigm. There has been a plethora of theoretical and practical contributions addressing a variety of ARO problems. A particular research focus has been the development of tractable approximations, e.g. in the form of decision rules ³, ⁴, and solution methods ⁵, ⁶, with many of the more recent efforts directed at the efficient handling of discrete recourse decisions ⁷, ⁸, ⁹, ¹⁰, ¹¹, ¹². We refer the reader to Yanıkoğlu et al. ¹³ for a comprehensive survey on ARO.

The term “adjustable robust optimization” had not appeared in the process systems engineering (PSE) literature until 2016 ¹⁴, ¹⁵, ¹⁶. Only then, as formally discussed by Zhang et al. ¹⁷, the PSE community realized that it actually has been applying the concept of (two-stage) ARO for decades, but under a different name: flexibility analysis ¹⁸. Many of the approaches developed in this line of work resemble the ones presented in the robust optimization literature. However, traditional flexibility analysis focuses primarily on nonlinear problems, which may be the reason why it so far has not been recognized or noticed by the operations research community. In the last few years, the number of ARO-related works in PSE has increased rapidly, addressing diverse applications in process design ¹⁹, ²⁰, planning and scheduling ¹⁴, ¹⁵, ¹⁶, ²¹, ²², ²³, ²⁴, model predictive control ²⁵, ²⁶, ²⁷, supply chain optimization ²⁸, ²⁹, etc.

In stochastic optimization, one distinguishes between exogenous and endogenous uncertainty. While exogenous uncertainty is not affected by the decision maker’s actions, certain properties of endogenous uncertain parameters are decision-dependent. Endogenous uncertainty has its origin in stochastic programming ³⁰, and mainly two types of endogenous uncertainty have been considered in the literature: type 1 where decisions alter the underlying probability distribution of an uncertain parameter ³¹, ³², ³³, and type 2 where decisions affect the time at which an uncertain parameter materializes or its true value is revealed. The vast majority of existing works address type-2 endogenous uncertainty ³⁴, ³⁵, ³⁶, ³⁷, ³⁸, ³⁹, ⁴⁰, ⁴¹, ⁴², ⁴³. Here, the main challenge is the modeling of decision-dependent nonanticipativity, which is typically achieved by encoding a conditional scenario tree. The size of this scenario tree grows exponentially with the number of uncertain parameters and decisions affecting the uncertainty, which leads to dramatically increased computational complexity compared to the case with only exogenous uncertainty. Hence, most research efforts have focused on the development of more efficient formulations ³⁵, ³⁶, ³⁸, ⁴², ⁴³ and solution methods ³⁵, ³⁹, ⁴⁰, ³⁷, ⁴¹, ⁴³. We refer to Apap and Grossmann ⁴³ for a comprehensive review of stochastic programming with endogenous uncertainty.

Only recently, endogenous uncertainty has also been considered in robust optimization. Poss ⁴⁴ and Nohadani and Sharma ⁴⁵ address static robust optimization with decision-dependent polyhedral uncertainty sets. A similar approach is taken by Lappas and Gounaris ⁴⁶, who further extend their framework to consider multistage robust process scheduling where the materialization of task-specific uncertainties, such as processing time and production yield, depends on the execution of the task ¹⁴, ²²; however, the uncertainty set is only affected by first-stage decisions, and all recourse variables are continuous. Avraamidou and Pistikopoulos ⁴⁷ apply multiparametric programming to address endogenous uncertainty in two-stage robust optimization. Finally, Feng et al. ⁴⁸ consider the multistage case with both continuous and binary recourse as well as uncertainty sets that can be affected by decisions at every stage, for which a decision rule approach is applied to derive tractable approximations.

The above-mentioned references investigate problems in which decisions directly affect the stochastic nature of uncertain parameters, which in the robust optimization context translates into uncertainty sets whose shape, size, or dimensionality may be decision-dependent. Also, although the materialization of uncertain parameters can be decision-dependent, it is assumed that the true values of uncertain parameters are observed immediately after their materialization. In a series of recent works, Vayanos and coworkers ⁴⁹, ⁵⁰, ⁵¹ consider robust optimization with decision-dependent information discovery ³⁸, where decisions determine whether and when uncertain parameters are observed. This constitutes a conceptually different type of endogenous uncertainty, where, instead of an underlying stochastic process, the source of uncertainty is the lack of information, which can be acquired with additional effort. In this case, the uncertainty set is reduced (“shrinks”) with additional observations. The proposed framework provides a means of modeling active learning, which refers to selecting experiments sequentially based on the outcomes of previous experiments. It is hence a promising approach to optimizing the trade-off between exploration and exploitation in sequential decision-making problems, particularly those involving complex constraints. Existing works have applied this approach to the learning of customer behavior through dynamic pricing ⁴⁹ and active preference elicitation with application to housing allocation ⁵¹.

We have come to realize that many problems in PSE require the simultaneous consideration of multiple types of endogenous uncertainty. Examples include strategic capacity planning, design of sensor networks, integrated online process optimization and parameter estimation, and maintenance planning with inspections. To address these problems, we need an approach that can consider all previously described types of endogenous uncertainty simultaneously, which, to the best of our knowledge, does not yet exist in the literature. In this work, we develop such a unified ARO framework, which significantly expands our capability to model data-driven optimization problems under uncertainty, particularly those involving active learning. The main contributions of this work are as follows:

•

We introduce a new refined classification of endogenous uncertainty that more comprehensively reflects the current understanding of the subject. We further provide a unifying robust optimization perspective and highlight the connection between endogenous uncertainty and active learning.
•

We present two-stage and multistage robust optimization formulations and show that they can be used to simultaneously consider all defined types of endogenous uncertainty subject to polyhedral uncertainty sets.
•

A decision rule approach based on previous work ⁴⁸ is employed to develop tractable approximations of the given problems, which involve both continuous and binary recourse variables. One major novelty of this work is the use of auxiliary uncertain parameters to model decision-dependent nonanticipativity in the multistage setting. We derive a reformulation that can be directly solved using off-the-shelf optimization solvers.
•

The versatility and efficacy of the proposed modeling framework are demonstrated in several computational case studies, which include plant redesign, maintenance planning with inspections, optimizing revision points in capacity planning, and production scheduling with active parameter estimation.

The remainder of this paper is organized as follows. In Section 2, we propose a new refined classification of endogenous uncertainty and systematically analyze the different types of endogenous uncertainty from a robust optimization perspective. In Section 3, the connection between endogenous uncertainty and active learning is established. We develop the two-stage formulation incorporating all types of endogenous uncertainty in Section 4 before presenting the multistage formulation in Section 5. The proposed decision rule approach is detailed in Section 6. In Section 7, results from the computational studies are presented. Finally, in Section 8, we close with some concluding remarks.

Notation

We use lowercase and uppercase boldface letters to denote vectors and matrices, respectively, e.g. $\bm{x}\in\mathbb{R}^{n}$ and $\bm{A}\in\mathbb{R}^{n\times m}$ . Scalar quantities are denoted by non-boldface letters. A vector $\bm{x}$ ’s $i$ th element is denoted by $x_{i}$ . We use $\circ$ and $\mathbbm{1}(\cdot)$ to denote the Hadamard multiplication operator and the indicator function, respectively. Furthermore, $\bm{0}$ , $\bm{e}$ , and $\bm{E}$ are the zero vector, all-ones vector, and all-ones matrix, respectively, while $\bm{e}_{i}$ is the standard basis vector whose $i$ th element is 1; their dimensions can be inferred from the context.

2 Endogenous Uncertainty Through the Robust Optimization Lens

As mentioned in Section 1, the literature distinguishes between type-1 and type-2 endogenous uncertainty. Here, we further refine the type-2 classification and propose the following definition of three different types of endogenous uncertainty:

•

Type 1: Decisions alter the probability distribution of uncertain parameters.
•

Type 2a: Decisions determine whether and when uncertain parameters materialize, i.e. actually become physically meaningful.
•

Type 2b: Decisions determine whether and when the true values of uncertain parameters are observed.

Endogenous uncertain parameters can be of more than one type. For example, a company may alter the probability distribution of the demand for its product by changing the selling price, in which case the product demand is a type-1 endogenous uncertain parameter. In addition, if we still have to decide whether or not to launch the product in the first place, the product demand is also type-2a endogenous since it will only materialize if the product is actually available on the market. Moreover, a customer’s product demand is observed when the order is placed; it is type-2b endogenous if we can affect the time of the order, e.g. by specifying it in a contract.

In robust optimization, instead of specifying a probability distribution, uncertainty is described using an uncertainty set, which is the set of all realizations of the uncertain parameters considered possible in the problem. In other words, robust optimization only requires the support of the probability distribution. A two-stage robust optimization problem can be generally formulated as follows:

	$\displaystyle\operatorname*{minimize}_{\bm{x}\in\mathcal{X},\,\bm{\tilde{x}}}$	$\displaystyle\max_{\bm{\xi}\in\Xi}f(\bm{x},\bm{\tilde{x}}(\bm{\xi}),\bm{\xi})$		(1)
	$\displaystyle\mathrm{subject\;to}$	$\displaystyle\bm{g}(\bm{x},\bm{\tilde{x}}(\bm{\xi}),\bm{\xi})\leq\bm{0},\;\bm{\tilde{x}}(\bm{\xi})\in\widetilde{\mathcal{X}}\quad\forall\,\bm{\xi}\in\Xi,$		(1)

where $\bm{x}\in\mathcal{X}$ are the first-stage variables and $\bm{\tilde{x}}\in\widetilde{\mathcal{X}}$ are the second-stage variables, which are functions of the uncertain parameters $\bm{\xi}\in\mathbb{R}^{K}$ . The objective and constraint functions are denoted by $f$ and $\bm{g}$ , respectively. Problem (1) minimizes the worst-case cost while ensuring that the constraints are satisfied for all $\bm{\xi}\in\Xi$ . By applying an epigraph reformulation, a problem of form (1) can be reformulated into a problem of the following form:

	$\displaystyle\operatorname*{minimize}_{\bm{x}\in\mathcal{X},\,\bm{\tilde{x}}}$	$\displaystyle x_{1}$		(2)
	$\displaystyle\mathrm{subject\;to}$	$\displaystyle\bm{g}(\bm{x},\bm{\tilde{x}}(\bm{\xi}),\bm{\xi})\leq\bm{0},\;\bm{\tilde{x}}(\bm{\xi})\in\widetilde{\mathcal{X}}\quad\forall\,\bm{\xi}\in\Xi,$		(2)

where the objective function is deterministic. Formulation (2) will be used in the remainder of this paper as it is more convenient for our discussion and subsequent reformulations.

Remark

There is a common misconception that a robust optimization problem has to consider a worst-case cost function. As it is apparent from (2), in theory, one can also consider any deterministic or deterministic equivalent (e.g. expressed in terms of scenarios) cost function. Constraint satisfaction with respect to the entire uncertainty set can be considered independently from the choice of objective function. Often, a non-worst-case objective function is more appropriate (see Section 7 for two examples).

In the case of endogenous uncertainty in the two-stage setting, the uncertainty depends on the first-stage decisions $\bm{x}$ . For each type of endogenous uncertainty, we show how $\bm{x}$ affect the following three major components of problem (2):

•

the uncertainty set $\Xi$ , which is constant in the case of only exogenous uncertainty but depends on $\bm{x}$ if the uncertainty is endogenous, i.e. $\Xi=\Xi(\bm{x})$ ;

•

the deterministic feasible set (DFS) defined as

\mathcal{F}(\bm{x},\bm{\xi}):=\left\{\bm{\tilde{x}}\in\widetilde{\mathcal{X}}:\bm{g}(\bm{x},\bm{\tilde{x}},\bm{\xi})\leq\bm{0}\right\},

which is the set of feasible second-stage solutions given $\bm{x}$ and $\bm{\xi}$ ;

•

and the second-stage recourse decisions $\bm{\tilde{x}}(\bm{\xi})$ as functions of $\bm{\xi}$ .

Type 1 (uncertainty alteration)

The probability distribution of type-1 endogenous uncertain parameters including its support can be altered by decisions. From the robust optimization perspective, this means that the shape and size of the uncertainty set $\Xi(\bm{x})$ depend on the first-stage decisions $\bm{x}$ . For an example involving two uncertain parameters, $\xi_{A}$ and $\xi_{B}$ , Figure 1a shows the uncertainty sets for two different choices of $\bm{x}$ . One results in a polyhedral uncertainty set with four facets while the other results in a polytope with three facets.

Type 2a (uncertainty materialization)

In the case of type-2a endogenous uncertainty, $\bm{x}$ change the uncertainty set by affecting which uncertain parameters actually materialize. In the example shown in Figure 1b, $\Xi(\bm{x})$ is the gray shaded polytope if both $\xi_{A}$ and $\xi_{B}$ materialize. However, if $\bm{x}$ are chosen such that $\xi_{A}$ does not materialize, $\Xi(\bm{x})$ becomes the solid red line segment that is a one-dimensional set for $\xi_{B}$ ; hence, $\bm{x}$ affect the dimensionality of the uncertainty set. Generally, when some uncertain parameters do not materialize, $\Xi(\bm{x})$ becomes the projection of the full-dimensional uncertainty set onto the space of the materialized uncertain parameters.

If an uncertain parameter does not materialize, it is physically meaningless and should therefore be irrelevant for the problem. More precisely, this means that the following DFS dependence condition has to hold: if $\bm{x}$ are chosen such that some uncertain parameters do not materialize, then $\mathcal{F}(\bm{x},\bm{\xi})$ has to be independent of those unmaterialized parameters. It is important to note that decision-dependent materialization of parameters is not a feature of uncertainty. It equally occurs in deterministic problems. Hence, the deterministic model, in which $\bm{\xi}$ is fixed, should already satisfy the DFS dependence condition, which usually directly extends to the robust formulation.

Finally, recourse decisions cannot depend on unmaterialized parameters. This, however, does not imply that recourse decisions are functions of all materialized parameters. While materialization is a necessary condition for an uncertain parameter to be considered in the recourse function, the sufficient condition is that its true value is observed. This subtle but important distinction is often ignored as almost all existing works on endogenous uncertainty assume that once an uncertain parameter materializes, its true value is automatically observed.

Type 2b (uncertainty observation)

In the case of type-2b endogenous uncertainty, we can decide if and when (in the multistage case) the true values of uncertain parameters are observed. Here, the underlying probability distribution and hence the uncertainty set is unaffected, but the reduction of uncertainty over time depends on when and which observations are made. This is illustrated in Figure 1c, which shows an instance in which the true values of the uncertain parameters $\xi_{A}$ and $\xi_{B}$ are $\bar{\xi}_{A}$ and $\bar{\xi}_{B}$ , respectively. If both parameters are observed, there will be no uncertainty in the second stage as the uncertainty set reduces to the point $(\bar{\xi}_{A},\bar{\xi}_{B})$ . However, if we only observe $\xi_{A}$ and find it to be $\bar{\xi}_{A}$ , the uncertainty with respect to $\xi_{B}$ remains and its reduced uncertainty set becomes the red solid line. Similarly, if we only observe $\xi_{B}$ , the reduced uncertainty set for $\xi_{A}$ becomes the blue solid line. Note that the reduced uncertainty set is conditional, i.e. it depends on the observations made. Clearly, a parameter can only be observed if it materializes. Also, reiterating our previous remark, recourse decisions can only be functions of observed uncertain parameters.

The above discussion references the two-stage robust optimization problem but naturally extends to the multistage case. An overview of the characteristics of the different types of endogenous uncertainty is shown in Table 1.

Table 1: Overview of the impact of decisions on the uncertainty set, the dependence of the DFS on uncertain parameters, and the dependence of the recourse functions on uncertain parameters for the different types of endogenous uncertainty.

	Uncertainty Set	DFS Dependence	Recourse Dependence
Type 1	shape, size	unaffected	unaffected
Type 2a	dimensionality	only dependence on materialized parameters	no dependence on unmaterialized parameters
Type 2b	reduction over time	unaffected	only dependence on observed parameters

3 Endogenous Uncertainty and Active Learning

In the context of statistical learning, active learning is the concept of actively gathering information for the purpose of statistical inference. It is also referred to as optimal experimental design as the decisions regarding information discovery are made with an objective in mind, such as minimizing the number of data points required to achieve a given level of confidence in the resulting statistical model. In this section, we highlight the connection between active learning and endogenous uncertainty, and show how it can be conceptualized in a set-based robust optimization framework.

The goal of learning is to gain insights into unknown properties, which can be viewed as uncertain parameters whose level of uncertainty we wish to reduce. Hence, learning is a means of uncertainty reduction. It does so by making new observations that condition the probability distribution; it does not alter the underlying distribution, which is an important distinction. With this perspective, we can model active learning using the notion of endogenous uncertainty. Let $\bm{\xi}\in\mathbb{R}^{K}$ be the uncertain parameters of interest and $\bm{\zeta}\in\mathbb{R}^{M}$ be uncertain parameters that can be observed before the realization of $\bm{\xi}$ . Before observing any $\bm{\zeta}$ , the probability distribution of $\bm{\xi}$ is $\mathbb{P}(\bm{\xi})$ . The decision maker can now choose which $\bm{\zeta}$ to observe in order to improve the current information state with regard to $\bm{\xi}$ , indicated by the binary variables $\bm{z}\in\{0,1\}^{M}$ where $z_{i}$ equals 1 if $\zeta_{i}$ is observed and 0 otherwise. After observing the chosen parameters, the updated probability distribution becomes $\mathbb{P}(\bm{\xi}\mid\zeta_{i}=\hat{\zeta}_{i}\;\forall\,i\text{ for which }z_{i}=1)$ , where $\bm{\hat{\zeta}}$ denotes a specific observed realization of $\bm{\zeta}$ . In a Bayesian sense, $\mathbb{P}(\bm{\xi})$ and $\mathbb{P}(\bm{\xi}|\bm{\zeta})$ can be interpreted as the prior and posterior distributions, respectively. Clearly, in this setting, $\bm{\zeta}$ are type-2b endogenous uncertain parameters. They may also be type-2a endogenous if their materialization is also decision-dependent. Note that $\bm{\xi}$ are considered exogenous uncertain parameters if their materialization and observation are not decision-dependent and if $\mathbb{P}(\bm{\xi})$ and $\mathbb{P}(\bm{\xi}|\bm{\zeta})$ are fixed.

In a robust optimization setting, uncertainty reduction implies reducing the uncertainty set. Using the same notation as above, uncertainty reduction through active learning can then be characterized as follows: Let $\Omega$ be the uncertainty set for the uncertain parameters $(\bm{\xi},\bm{\zeta})$ before the observation of any $\bm{\zeta}$ , i.e. $(\bm{\xi},\bm{\zeta})\in\Omega$ . Once some $\bm{\zeta}$ are observed (indicated by the binary variables $\bm{z}$ ) with the realization $\bm{\hat{\zeta}}$ , the uncertainty set reduces to $\bar{\Omega}(\bm{z},\bm{\hat{\zeta}})=\{(\bm{\xi},\bm{\zeta}):(\bm{\xi},\bm{\zeta})\in\Xi,\;\bm{\zeta}=\bm{z}\circ\bm{\hat{\zeta}}\}$ . Note that $\bar{\Omega}(\bm{z},\bm{\hat{\zeta}})\subseteq\Omega$ for every possible $\bm{\hat{\zeta}}$ as specified by $\Omega$ .

Example 1.

We consider investing in a large-scale industrial manufacturing plant, where the real plant capacity $q$ is unknown before it is actually built. To reduce the financial risk, we have the option of building a small-scale pilot plant whose achieved capacity $p$ can help better predict the capacity of the industrial-scale plant. The uncertainty set depicted in Figure 2 shows that $p$ and $q$ are highly correlated. If no pilot plant is built, the uncertain parameter $p$ does not materialize. As a result, we can only rely on our initial belief that the marginal uncertainty set for $q$ is $\mathcal{Q}=[q^{\min},q^{\max}]$ . However, if we do build a pilot plant and observe a pilot plant capacity of $\bar{p}\in[p^{\min},p^{\max}]$ , the uncertainty set for $q$ reduces to a significantly smaller $\bar{\mathcal{Q}}(\bar{p})=\{q:\alpha+\beta\bar{p}-\Delta\leq q\leq\alpha+\beta\bar{p}+\Delta\}$ . Note that in this example, $p$ is a type-2a and type-2b uncertain parameter since it only materializes (and can then be observed) if we decide to build the pilot plant.

In Example 1, we use the term marginal uncertainty set and define it as the projection of the full-dimensional uncertainty set onto the space of a subset of variables, in this case $q$ . Generally, given uncertain parameters $\bm{\xi}$ and $\bm{\zeta}$ , observing $\bm{\zeta}$ further restricts the marginal uncertainty set for $\bm{\xi}$ , denoted by $\Xi$ , only if the uncertain parameters are correlated. The geometric interpretation is depicted in Figure 3, where an observation $\bm{\hat{\zeta}}$ results in cutting planes (indicated by the red dashed lines) added to $\Xi$ such that the updated marginal uncertainty set for $\bm{\xi}$ becomes $\overline{\Xi}(\bm{\hat{\zeta}})$ .

Modeling active learning as a robust optimization problem with endogenous uncertainty provides an opportunity for integrated optimization and learning, which are two tasks typically performed separately. Central to this problem is the trade-off between exploration and exploitation since our actions affect both the amount of new information and the reward that we receive. The proposed framework can readily incorporate this trade-off and is particularly well suited for problems with complex constraints and recourse decisions, which often cause difficulties for many alternative methods, such as Bayesian optimization ⁵², reinforcement learning ⁵³, and multi-armed bandit optimization ⁵⁴.

4 Two-Stage Robust Optimization with Endogenous Uncertainty

In the following, we show how type-1, type-2a, and type-2b endogenous uncertainty can be incorporated in a two-stage robust optimization formulation, eventually leading to a unified model that can capture all these types of endogenous uncertainty simultaneously.

4.1 Incorporating Type-1 Endogenous Uncertainty

Consider the following two-stage robust optimization problem:

$\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{\tilde{x}}}$	$\displaystyle x_{1}$	(3)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle\bm{x}\in\mathcal{X}$
	$\displaystyle\left.\begin{array}[]{l}\bm{A}(\bm{\xi})\bm{x}+\bm{\widetilde{A}}(\bm{\xi})\bm{\tilde{x}}(\bm{\xi})\leq\bm{b}(\bm{\xi})\\[4.0pt] \bm{\tilde{x}}(\bm{\xi})\in\widetilde{\mathcal{X}}\end{array}\right\}\quad\forall\,\bm{\xi}\in\Xi(\bm{x}),$

where the dependence of the uncertainty set $\Xi$ on $\bm{x}$ indicates type-1 endogenous uncertainty. Throughout this work, we restrict our discussion to decision-dependent uncertainty sets of the following polyhedral form:

\Xi(\bm{x})=\left\{\bm{\xi}\in\mathbb{R}^{K}:\bm{W}\bm{\xi}\leq\bm{U}\bm{x}\right\},

(4)

where we assume that $\Xi(\bm{x})$ is always nonempty and bounded, i.e. $\Xi(\bm{x})\subseteq\{\bm{\xi}\in\mathbb{R}^{K}:\bm{\xi}^{\min}\leq\bm{\xi}\leq\bm{\xi}^{\max}\}$ for any $\bm{x}\in\mathcal{X}$ .

Remark

Some endogenous uncertainty can be modeled as exogenous uncertainty through a simple transformation. For example, consider a scalar endogenous uncertain parameter $\xi$ with the uncertainty set $\Xi(x)=\{\xi\in\mathbb{R}:0\leq\xi\leq x\}$ , which depends on variable $x$ . In this case, we could define a new uncertain parameter $\tilde{\xi}$ and add the equation $\xi=\tilde{\xi}x$ to the model. The uncertainty set for $\tilde{\xi}$ is then $\widetilde{\Xi}=\{\tilde{\xi}\in\mathbb{R}:0\leq\tilde{\xi}\leq 1\}$ , which does not depend on $x$ , and $\xi$ can now be treated as a variable that is adjustable with respect to $\tilde{\xi}$ .

4.2 Incorporating Type-2a Endogenous Uncertainty

To model type-2a endogenous uncertainty, we introduce binary variables $\bm{y}\in\{0,1\}^{K}$ to encode the materialization of uncertain parameters, i.e. $y_{i}=1$ if and only if $\xi_{i}$ materializes. Then, both type-1 endogenous uncertainty with decision-dependent uncertainty set of the form (4) and type-2a endogenous uncertainty can be captured in a two-stage formulation of the following form:

$\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{y},\,\bm{\tilde{x}}}$	$\displaystyle x_{1}$	(5)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle\bm{x}\in\mathcal{X},\;\bm{y}\in\mathcal{Y}$
	$\displaystyle\left.\begin{array}[]{l}\bm{A}(\bm{\xi})\bm{x}+\bm{D}(\bm{\xi})\bm{y}+\bm{\widetilde{A}}(\bm{\xi})\bm{\tilde{x}}(\bm{y}\circ\bm{\xi})\leq\bm{b}(\bm{\xi})\\[4.0pt] \bm{\tilde{x}}(\bm{y}\circ\bm{\xi})\in\widetilde{\mathcal{X}}\end{array}\right\}\quad\forall\,\bm{\xi}\in\Xi(\bm{x},\bm{y}),$

where $\mathcal{Y}\subseteq\{0,1\}^{K}$ and

\Xi(\bm{x},\bm{y})=\left\{\bm{\xi}\in\mathbb{R}^{K}:\bm{W}\bm{\xi}\leq\bm{U}\bm{x}+\bm{\overline{U}}\bm{y}\right\},

(6)

assuming that $\Xi(\bm{x},\bm{y})\subseteq\{\bm{\xi}\in\mathbb{R}^{K}:\bm{\xi}^{\min}\leq\bm{\xi}\leq\bm{\xi}^{\max}\}$ for any $\bm{x}\in\mathcal{X}$ and $\bm{y}\in\mathcal{Y}$ . Note that formulation (5) is actually of the same form as (3); we have merely augmented the first-stage variables with $\bm{y}$ and restricted the recourse functions to only depend on materialized parameters, i.e. $\bm{y}\circ\bm{\xi}$ . Here, the assumption is that all materialized uncertain parameters are also observed.

While the recourse dependence property of type-2a endogenous uncertainty is explicitly stated, it may not be immediately obvious how the change of dimensionality of the uncertainty set and the DFS dependence property can be captured in (5). Here, the uncertainty set $\Xi(\bm{x},\bm{y})$ has to be constructed such that for a given $\bm{y}$ , its projection onto the space of materialized uncertain parameters, i.e. $\xi_{i}$ for which $y_{i}=1$ , forms the desired uncertainty set for the given $\bm{y}$ . In addition, the constraints in (5) have to be constructed such that the feasible region is not affected by any $\xi_{i}\in\Xi_{i}(\bm{x},\bm{y})$ for which $y_{i}=0$ , where $\Xi_{i}(\bm{x},\bm{y})$ denotes the projection of $\Xi(\bm{x},\bm{y})$ onto the $\xi_{i}$ -space. Uncertainty sets and constraints satisfying these requirements can be constructed in various ways. To show the generality of formulation (5), we provide the following result.

Proposition 1.

Formulation (5) is general in a sense that it can model the general case in which every possible set of materialized uncertain parameters results in a distinct set of constraints and a distinct $\bm{x}$ -dependent uncertainty set of appropriate dimensionality.

Proof.

We provide a constructive proof for Proposition 1. In the following, we refer to a possible set of materialized uncertain parameters as a materialization scenario. Since $\bm{y}\in\{0,1\}^{K}$ is defined such that each materialization scenario is given by a specific $\bm{y}$ with $y_{i}=1$ if and only if the uncertain parameter $\xi_{i}$ materializes, the set of materialization scenarios is finite (maximum $2^{K}$ ). Let $S$ be the number of materialization scenarios and $\mathcal{S}:=\{1,\dots,S\}$ . We define a set $\mathcal{Y}:=\{\bm{\bar{y}}^{1},\dots,\bm{\bar{y}}^{S}\}$ where $\bm{\bar{y}}^{s}$ is the specific $\bm{y}$ -vector that represents scenario $s$ . In the most general case, each materialization scenario is associated with a distinct set of constraints and a distinct uncertainty set involving only the materialized uncertain parameters. Hence, the two-stage problem can be formulated as follows:

$\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{y},\,\bm{\tilde{x}}}$	$\displaystyle x_{1}$	(7)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle\bm{x}\in\mathcal{X},\;\bm{y}\in\mathcal{Y}$
	$\displaystyle\bigvee_{s\in\mathcal{S}}\>\left[\begin{array}[]{c}\bm{y}=\bm{\bar{y}}^{s}\\[4.0pt] \left.\begin{array}[]{c}\bm{A}^{s}(\bm{\xi}^{s})\bm{x}+\bm{D}^{s}(\bm{\xi}^{s})\bm{\bar{y}}^{s}+\bm{\widetilde{A}}^{s}(\bm{\xi}^{s})\bm{\tilde{x}}(\bm{\xi}^{s})\leq\bm{b}^{s}(\bm{\xi}^{s})\\[4.0pt] \bm{\tilde{x}}(\bm{\xi}^{s})\in\widetilde{\mathcal{X}}\end{array}\right\}\;\forall\,\bm{\xi}^{s}\in\Xi^{s}(\bm{x})\end{array}\right],$

where the disjunction indicates that only the constraints for the selected materialization scenario need to be satisfied. Here, $\bm{\xi}^{s}$ only contains the $\xi_{i}$ for which $\bar{y}^{s}_{i}=1$ and is therefore a $K^{s}$ -dimensional vector with $K^{s}=\bm{e}^{\top}\bm{\bar{y}}^{s}$ . The $K^{s}$ -dimensional scenario-specific uncertainty set is

\Xi^{s}(\bm{x})=\left\{\bm{\xi}^{s}\in\mathbb{R}^{K^{s}}:\bm{W}^{s}\bm{\xi}^{s}\leq\bm{U}^{s}\bm{x}\right\}\subseteq\left\{\bm{\xi}^{s}\in\mathbb{R}^{K^{s}}:\xi^{\min}_{i}\leq\xi_{i}\leq\xi^{\max}_{i}\;\;\forall\,i\text{ for which }\bar{y}^{s}_{i}=1\right\}.

Similar to the constraints, a disjunction can also be used to define an overall uncertainty set $\Xi(\bm{x},\bm{y})$ . By doing so, problem (7) can be reformulated as

$\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{y},\,\bm{\tilde{x}}}$	$\displaystyle x_{1}$	(8)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle\bm{x}\in\mathcal{X},\;\bm{y}\in\mathcal{Y}$
	$\displaystyle\bigvee_{s\in\mathcal{S}}\>\left[\begin{array}[]{c}\bm{y}=\bm{\bar{y}}^{s}\\[4.0pt] \bm{A}^{s}(\bm{\xi}^{s})\bm{x}+\bm{D}^{s}(\bm{\xi}^{s})\bm{\bar{y}}^{s}+\bm{\widetilde{A}}^{s}(\bm{\xi}^{s})\bm{\tilde{x}}(\bm{\bar{y}}^{s}\circ\bm{\xi})\leq\bm{b}^{s}(\bm{\xi}^{s})\\[4.0pt] \bm{\tilde{x}}(\bm{\bar{y}}^{s}\circ\bm{\xi})\in\widetilde{\mathcal{X}}\end{array}\right]\quad\forall\,\bm{\xi}\in\Xi(\bm{x},\bm{y})$

with

\Xi(\bm{x},\bm{y})=\left\{\bm{\xi}\in\mathbb{R}^{K}:\bigvee_{s\in\mathcal{S}}\>\left[\begin{array}[]{c}\bm{y}=\bm{\bar{y}}^{s}\\[4.0pt] \bm{W}^{s}\bm{\xi}^{s}\leq\bm{U}^{s}\bm{x}\\[4.0pt] \xi_{i}=0\;\;\forall\,i\text{ for which }\bar{y}^{s}_{i}=0\end{array}\right]\right\}.

Note that in $\Xi(\bm{x},\bm{y})$ , we set unmaterialized uncertain parameters to zero; this is a convenient but arbitrary choice since for a given materialization scenario $s$ , only the projection of $\Xi(\bm{x},\bm{y})$ onto the $\bm{\xi}^{s}$ -space matters.

With additional binary variables $\bm{\omega}\in\{0,1\}^{S}$ , we can reformulate the disjunctions and arrive at the following formulation:

$\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{\omega},\,\bm{y},\,\bm{\tilde{x}}}$	$\displaystyle x_{1}$	(9)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle\bm{x}\in\mathcal{X},\;\bm{\omega}\in\{0,1\}^{S},\;\bm{y}\in\mathcal{Y}$
	$\displaystyle\sum_{s\in\mathcal{S}}\omega^{s}=1,\quad\omega^{s}=1\Leftrightarrow\bm{y}=\bm{\bar{y}}^{s}\quad\forall\,s\in\mathcal{S}$
	$\displaystyle\bm{A}^{s}(\bm{\xi}^{s})\bm{x}+\bm{D}^{s}(\bm{\xi}^{s})\bm{y}+\bm{\widetilde{A}}^{s}(\bm{\xi}^{s})\bm{\tilde{x}}(\bm{y}\circ\bm{\xi})\leq\bm{b}^{s}(\bm{\xi}^{s})+M^{s}\bm{e}(1-\omega^{s})\quad\forall\,s\in\mathcal{S},\,\bm{\xi}\in\Xi(\bm{x},\bm{\omega},\bm{y})$
	$\displaystyle\bm{\tilde{x}}(\bm{y}\circ\bm{\xi})\in\widetilde{\mathcal{X}}\quad\forall\,\bm{\xi}\in\Xi(\bm{x},\bm{\omega},\bm{y})$

with

\Xi(\bm{x},\bm{\omega},\bm{y})=\left\{\bm{\xi}\in\mathbb{R}^{K}:\begin{array}[]{l}\bm{W}^{s}\bm{\xi}^{s}\leq\bm{U}^{s}\bm{x}+\widehat{M}^{s}\bm{e}(1-\omega^{s})\quad\forall\,s\in\mathcal{S}\\[4.0pt] \bm{\xi}^{\min}\circ\bm{y}\leq\bm{\xi}\leq\bm{\xi}^{\max}\circ\bm{y}\end{array}\right\},

where $M^{s}$ and $\widehat{M}^{s}$ are vectors of sufficiently large parameters. As the statement $\omega^{s}=1\Leftrightarrow\bm{y}=\bm{\bar{y}}^{s}$ can be expressed as as set of linear constraints, e.g. $\bm{\bar{y}}^{s}\omega^{s}\leq\bm{y}\leq\bm{e}+(\bm{\bar{y}}^{s}-\bm{e})\omega^{s}$ , formulation (9) can be easily transformed into the form of (5) by appropriate redefinition of variables, constraint matrices, and right-hand sides. ∎

4.3 Incorporating Type-2b Endogenous Uncertainty

Finally, we introduce additional binary variables $\bm{z}\in\{0,1\}^{K}$ to indicate which uncertain parameters are observed, and we arrive at the following unified two-stage robust optimization formulation that considers type-1, type-2a, and type-2b endogenous uncertainty:

$\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{y},\,\bm{z},\,\bm{\tilde{x}}}$	$\displaystyle x_{1}$	(10)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle\bm{x}\in\mathcal{X},\;\bm{y}\in\mathcal{Y},\;\bm{z}\in\mathcal{Z},\;\bm{z}\leq\bm{y}$
	$\displaystyle\left.\begin{array}[]{l}\bm{A}(\bm{\xi})\bm{x}+\bm{D}(\bm{\xi})\bm{y}+\bm{H}(\bm{\xi})\bm{z}+\bm{\widetilde{A}}(\bm{\xi})\bm{\tilde{x}}(\bm{z}\circ\bm{\xi})\leq\bm{b}(\bm{\xi})\\[4.0pt] \bm{\tilde{x}}(\bm{z}\circ\bm{\xi})\in\widetilde{\mathcal{X}}\end{array}\right\}\quad\forall\,\bm{\xi}\in\Xi(\bm{x},\bm{y}),$

where $\mathcal{Z}\subseteq\{0,1\}^{K}$ . The recourse decisions $\bm{\tilde{x}}(\bm{z}\circ\bm{\xi})$ can only be functions of the observed parameters. Also, by requiring $\bm{z}\leq\bm{y}$ , parameters can only be observed if they materialize. The formulation of the uncertainty set does not change from (6):

\Xi(\bm{x},\bm{y})=\left\{\bm{\xi}\in\mathbb{R}^{K}:\bm{W}\bm{\xi}\leq\bm{U}\bm{x}+\bm{\overline{U}}\bm{y}\right\}.

5 Multistage Robust Optimization with Endogenous Uncertainty

In multistage robust optimization with endogenous uncertainty, decisions that affect the uncertainty can also be recourse variables. Moreover, while the two-stage problem can only consider decisions determining if uncertain parameters materialize or are observed, multistage formulations can incorporate decisions affecting when, i.e. in which stage, uncertain parameters materialize or are observed.

Let $\mathcal{T}=\{1,\dots,T\}$ denote the set of stages and $\mathcal{I}$ the set of uncertain parameters. An uncertain parameter $i\in\mathcal{I}$ can materialize anytime between (and including) given stages $\tau_{i}$ and $\bar{\tau}_{i}$ . We define a vector of uncertain parameters $\bm{\xi}_{t}\in\mathbb{R}^{K_{t}}$ that contains all uncertain parameters $i$ for which $\tau_{i}=t$ . We further define all uncertain parameters that can materialize in or prior to stage $t$ as $\bm{\xi}_{[t]}=(\bm{\xi}_{1},\dots,\bm{\xi}_{t})$ , and $\bm{\xi}=\bm{\xi}_{[T]}$ . The set of uncertain parameters for which $\tau_{i}=t$ is denoted by $\widetilde{\mathcal{I}}_{t}$ , and we have $\widetilde{\mathcal{I}}_{t}\cap\widetilde{\mathcal{I}}_{t^{\prime}}=\emptyset$ for all $t\neq t^{\prime}$ and $\mathcal{I}=\bigcup_{t\in\mathcal{T}}\widetilde{\mathcal{I}}_{t}$ . The set of uncertain parameters that can materialize in stage $t$ is denoted by $\bar{\mathcal{I}}_{t}$ . In the special case where every uncertain parameter can only materialize in one stage, $\bar{\mathcal{I}}_{t}=\widetilde{\mathcal{I}}_{t}$ . Similarly, let $\widehat{\mathcal{I}}_{t}$ denote the set of uncertain parameters that can be observed in stage $t$ , and $\hat{\tau}_{i}$ and $\tilde{\tau}_{i}$ be the stages between which uncertain parameter $i$ can be observed. Since the observation of an uncertain parameter can only occur after its materialization, we have $\hat{\tau}_{i}\geq\tau_{i}$ and $\tilde{\tau}_{i}\geq\bar{\tau}_{i}$ .

The general multistage robust optimization problem with type-1, type-2a, and type-2b endogenous uncertainty can be formulated as follows:

$\displaystyle\operatorname*{minimize}_{\bm{x},\,\bm{y},\,\bm{z}}$	$\displaystyle x_{11}$	(11)
$\displaystyle\begin{array}[]{l}\mathrm{subject\;to}\;\vphantom{\sum\limits_{t^{\prime}=1}^{t}}\\[10.0pt] \vphantom{\bm{x}_{t}(\bm{\xi}_{[t]})\in\mathcal{X}_{t}}\\[4.0pt] \vphantom{\bm{y}_{t}(\bm{\xi}_{[t]})\in\mathcal{Y}_{t}}\\[4.0pt] \vphantom{\bm{z}_{t}(\bm{\xi}_{[t]})\in\mathcal{Z}_{t}}\end{array}$	$\displaystyle\left.\begin{array}[]{l}\sum\limits_{t^{\prime}=1}^{t}\left[\bm{A}_{tt^{\prime}}(\bm{\xi}_{[t]})\bm{x}_{t^{\prime}}+\bm{D}_{tt^{\prime}}(\bm{\xi}_{[t]})\bm{y}_{t^{\prime}}+\bm{H}_{tt^{\prime}}(\bm{\xi}_{[t]})\bm{z}_{t^{\prime}}\right]\leq\bm{b}_{t}(\bm{\xi}_{[t]})\\[10.0pt] \bm{x}_{t}=\bm{x}_{t}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t]}\right),\;\bm{x}_{t}\in\mathcal{X}_{t}\\[4.0pt] \bm{y}_{t}=\bm{y}_{t}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t]}\right),\;\bm{y}_{t}\in\mathcal{Y}_{t}\\[4.0pt] \bm{z}_{t}=\bm{z}_{t}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t]}\right),\;\bm{z}_{t}\in\mathcal{Z}_{t}\end{array}\right\}$
	$\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\forall\,t\in\mathcal{T},\,\bm{\xi}_{[t]}\in\Xi_{t}(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-2]})$

where $\mathcal{Y}_{t}\in\{0,1\}^{|\bar{\mathcal{I}}_{t+1}|}$ and $\mathcal{Z}_{t}\in\{0,1\}^{|\widehat{\mathcal{I}}_{t+1}|}$ . The notation is such that $\bm{x}_{[t]}=(\bm{x}_{1},\dots,\bm{x}_{t})$ , $\bm{x}=\bm{x}_{[T]}$ , $\bm{y}_{[t]}=(\bm{y}_{1},\dots,\bm{y}_{t})$ , $\bm{y}=\bm{y}_{[T]}$ , $\bm{z}_{[t]}=(\bm{z}_{1},\dots,\bm{z}_{t})$ , and $\bm{z}=\bm{z}_{[T]}$ . The binary variable $y_{ti}$ equals 1 if uncertain parameter $i$ materializes in stage $t+1$ , whereas the binary variable $z_{ti}$ equals 1 if uncertain parameter $i$ is observed in stage $t+1$ . We use auxiliary variables $\bm{\bar{z}}_{t-1}=\left(\sum_{t^{\prime}=0}^{t-1}z_{t^{\prime}1},\dots,\sum_{t^{\prime}=0}^{t-1}z_{t^{\prime}K_{[t]}}\right)$ with $K_{[t]}=\sum_{t^{\prime}=1}^{t}K_{t^{\prime}}$ to indicate which uncertain parameters in $\bm{\xi}_{[t]}$ are observed up to stage $t$ . The recourse variables in stage $t$ , i.e. $\bm{x}_{t}$ , $\bm{y}_{t}$ , and $\bm{z}_{t}$ , are functions of $\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t]}$ , which represent the uncertain parameters observed up to stage $t$ . For notational convenience, we assume that $\bm{\xi}_{1}=\xi_{11}=1$ , $\bm{y}_{0}=y_{01}=1$ , and $\bm{z}_{0}=z_{01}=1$ . The stage-specific uncertainty set $\Xi_{t}$ is defined such that $\Xi_{1}=\{\bm{\xi}_{1}\in\mathbb{R}:\xi_{11}=1\}$ , and for $t\geq 2$ :

\Xi_{t}(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-2]})=\left\{\bm{\xi}_{[t]}\in\mathbb{R}^{K_{[t]}}:\begin{array}[]{l}\bm{W}_{t}\bm{\xi}_{[t]}\leq\sum\limits_{t^{\prime}=1}^{t-1}\left[\bm{U}_{tt^{\prime}}\bm{x}_{t^{\prime}}+\bm{\overline{U}}_{tt^{\prime}}\bm{y}_{t^{\prime}}\right]\\[8.0pt] \bm{x}_{t^{\prime}}=\bm{x}_{t^{\prime}}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t^{\prime}]}\right)\quad\forall\,t^{\prime}=1,\dots,t-1\\[4.0pt] \bm{y}_{t^{\prime}}=\bm{y}_{t^{\prime}}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t^{\prime}]}\right)\quad\forall\,t^{\prime}=1,\dots,t-1\\[4.0pt] \bm{z}_{t^{\prime}}=\bm{z}_{t^{\prime}}\left(\bm{\bar{z}}_{t-1}\circ\bm{\xi}_{[t^{\prime}]}\right)\quad\forall\,t^{\prime}=1,\dots,t-1\end{array}\right\},

(12)

where the polyhedral set directly depends on $\bm{x}_{[t-1]}$ and $\bm{y}_{[t-1]}$ , which in turn are functions of observed uncertain parameters given by $\bm{z}_{[t-2]}$ . We assume that $\Xi_{t}(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-2]})\subseteq\{\bm{\xi}_{[t]}\in\mathbb{R}^{K_{[t]}}:\bm{\xi}^{\min}_{[t]}\leq\bm{\xi}_{[t]}\leq\bm{\xi}^{\max}_{[t]}\}$ for all feasible $(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-2]})$ .

The linear constraints in problem (11) include the following inequalities:


	$\displaystyle\sum_{t^{\prime}=\tau_{i}}^{\bar{\tau}_{i}}y_{t^{\prime}-1,i}\leq 1\quad\forall\,i\in\mathcal{I}$		(13a)
	$\displaystyle\sum_{t^{\prime}=\hat{\tau}_{i}}^{\tilde{\tau}_{i}}z_{t^{\prime}-1,i}\leq 1\quad\forall\,i\in\mathcal{I}$		(13b)
	$\displaystyle\sum_{t^{\prime}=\tau_{i}}^{\min\{t,\bar{\tau}_{i}\}}y_{t^{\prime}-1,i}\geq z_{t-1,i}\quad\forall\,t\in\mathcal{T}\setminus\{1\},\,i\in\widehat{\mathcal{I}}_{t}.$		(13c)

Constraints (13a) state that each uncertain parameter $i$ can only materialize between stages $\tau_{i}$ and $\bar{\tau}_{i}$ ; analogously, uncertain parameter $i$ can only be observed between stages $\hat{\tau}_{i}$ and $\tilde{\tau}_{i}$ according to (13b). Inequalities (13c) ensure that an uncertain parameter can only be observed after it has materialized.

6 Decision Rule Approach

In this section, we present a decision rule approach to (approximately) solve the general multistage robust optimization problem under endogenous uncertainty, which reduces to the two-stage case when $T=2$ . The proposed approach is an extension of the method presented in Feng et al. ⁴⁸ and relies on the concept of lifted uncertainty ⁵⁵. Here, the main innovation is the use of auxiliary uncertain parameters to model decision-dependent nonanticipativity.

For ease of exposition and for tractability reasons, we consider the case of fixed recourse, i.e. $\bm{A}_{tt^{\prime}}$ , $\bm{D}_{tt^{\prime}}$ , and $\bm{H}_{tt^{\prime}}$ do not depend on $\bm{\xi}_{[t]}$ . In theory, the presented approach can be adapted to consider random recourse ¹¹, however, at considerable additional computational cost. We further assume that $\bm{b}_{t}(\bm{\xi}_{[t]})=\bm{B}_{t}\bm{\xi}_{[t]}$ and $\mathcal{X}_{t}=\mathbb{R}^{\bar{N}_{t}}\times\{0,1\}^{\widehat{N}_{t}}$ . Note that $\bm{x}_{t}$ contain both continuous and binary variables, and we define $\bm{x}_{t}=(\bm{x}^{\mathrm{c}}_{t},\bm{x}^{\mathrm{b}}_{t})$ such that $\bm{x}^{\mathrm{c}}_{t}\in\mathbb{R}^{\bar{N}_{t}}$ and $\bm{x}^{\mathrm{b}}_{t}\in\{0,1\}^{\widehat{N}_{t}}$ .

6.1 Decision-Dependent Nonanticipativity

In problem (11), recourse variables can only be adjusted based on the values of observed uncertain parameters; this is referred to as nonanticipativity. As a result, any decision rule applied to a recourse variable can only be a function of uncertain parameters that have been observed up to the corresponding stage. Suppose we apply decision rules with a linear structure such that, for example, $\bm{x}_{t}=\bm{\widetilde{X}}_{t}\bm{\xi}_{[t]}$ . Then, intuitively, one may attempt to model nonanticipavity by incorporating additional constraints that force decision rule parameters corresponding to unobserved uncertain parameters to be zero, e.g. $-M\bar{z}_{t-1,i}\bm{e}\leq\bm{\tilde{x}}_{ti}\leq M\bar{z}_{t-1,i}\bm{e}$ with $M$ being a sufficiently large parameter. In the two-stage case, where $\bm{z}$ are only first-stage variables, this is a viable approach. However, in a multistage setting, $\bm{z}_{[t-1]}$ can themselves be adjustable recourse variables; in that case, the added constraints also require $\bm{\widetilde{X}}_{t}$ to be adjustable. Simply applying another decision rule to $\bm{\widetilde{X}}_{t}$ does not resolve the issue since the decision rule parameters for $\bm{\widetilde{X}}_{t}$ would also have to be adjustable because of their dependence on $\bm{z}_{[t-1]}$ . For fixed uncertainty sets, Vayanos et al. ³⁸ propose an approach that divides the uncertainty set into a number of preselected subsets $\mathcal{S}$ and introduces recourse variables for each subset $s\in\mathcal{S}$ . As a result, constant $\bm{z}_{s}$ can be selected for each subset $s$ , which allows restrictions to be imposed on the decision rule parameters for the remaining recourse variables such that nonanticipativity is satisfied. However, this approach is severely computationally intractable if the number of uncertainty subsets is large, which is only exacerbated in the case of problem (11) where the uncertainty set is generally not fixed.

We propose an alternative, more tractable approach to incorporate decision-dependent nonanticipativity. Notice that given a linear decision rule structure, the dependence of a recourse variable on an uncertain parameter is equally eliminated if, instead of setting the corresponding decision rule coefficients to zero, the uncertain parameter itself is set to zero. In the special case where an uncertain parameter can only materialize in one stage and its materialization directly leads to its observation (i.e. $\bm{y}=\bm{z}$ ), this can be easily incorporated by including the following inequalities in the definition of the uncertainty set $\Xi_{t}$ :

\bm{\xi}^{\min}_{t^{\prime}}\circ\bm{y}_{t^{\prime}-1}\leq\bm{\xi}_{t^{\prime}}\leq\bm{\xi}^{\max}_{t^{\prime}}\circ\bm{y}_{t^{\prime}-1}\quad\forall\,t^{\prime}=1,\dots,t,

(14)

which are valid since, as noted in the proof of Proposition 1, the values of unmaterialized parameters can be set to zero without loss of generality. However, inequalities (14) cannot capture the general case where there may be multiple stages in which an uncertain parameter can materialize or where despite being already materialized, one can still decide whether or not to observe the true value of the uncertain parameter in a given stage.

To address the general case, we first generalize (14) such that uncertain parameters that can materialize in multiple stages are considered:

\xi^{\min}_{\tau_{i}i}\sum_{t^{\prime}=\tau_{i}}^{\min\{t,\bar{\tau}_{i}\}}y_{t^{\prime}-1,i}\leq\xi_{\tau_{i}i}\leq\xi^{\max}_{\tau_{i}i}\sum_{t^{\prime}=\tau_{i}}^{\min\{t,\bar{\tau}_{i}\}}y_{t^{\prime}-1,i}\quad\forall\,i\in\bigcup_{t^{\prime}=1}^{t}\widetilde{\mathcal{I}}_{t^{\prime}}.

(15)

We then introduce $\tilde{\tau}_{i}-\hat{\tau}_{i}+1$ auxiliary uncertain parameters $\zeta_{\hat{\tau}_{i}i},\dots,\zeta_{\tilde{\tau}_{i}i}$ for every $i\in\mathcal{I}$ . Note that for an original uncertain parameter $i$ for which $\tau_{i}=\tilde{\tau}_{i}$ and $y_{\tau_{i}-1,i}=z_{\tau_{i}-1,i}$ (which corresponds to the above-mentioned special case), we do not actually require auxiliary uncertain parameters; however, for ease of notation, we assume that we have auxiliary uncertain parameters associated with every $i\in\mathcal{I}$ . We incorporate the following inequalities in the definition of the uncertainty set for stage $t$ :


	$\displaystyle\xi_{\tau_{i}i}-\xi^{\max}_{\tau_{i}i}(1-z_{t^{\prime}-1,i})\leq\zeta_{t^{\prime}i}\leq\xi_{\tau_{i}i}+M(1-z_{t^{\prime}-1,i})\quad\forall\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}}$		(16a)
	$\displaystyle\xi^{\min}_{\tau_{i}i}z_{t^{\prime}-1,i}\leq\zeta_{t^{\prime}i}\leq\xi^{\max}_{\tau_{i}i}z_{t^{\prime}-1,i}\quad\forall\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}},$		(16b)

where $M$ is a sufficiently large big-M parameter. Inequalities (16) enforce that $\zeta_{t^{\prime}i}$ takes the value of $\xi_{\tau_{i}i}$ if $\xi_{\tau_{i}i}$ is observed in stage $t^{\prime}$ , and is zero otherwise.

With a slight abuse of notation, we define the following new uncertainty set that considers the original and the auxiliary uncertain parameters:

		$\displaystyle\Theta_{t}(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-1]})=\left\{(\bm{\xi}_{[t]},\bm{\zeta}_{[t]})\in\mathbb{R}^{\sum_{t^{\prime}=1}^{t}K_{t^{\prime}}+\|\widehat{\mathcal{I}}_{t^{\prime}}\|}:\sum_{t^{\prime}=1}^{t}\left[\bm{W}_{tt^{\prime}}\bm{\xi}_{t^{\prime}}+\bm{\overline{W}}_{tt^{\prime}}\bm{\zeta}_{t^{\prime}}\right]\right.$
		$\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\leq\left.\sum\limits_{t^{\prime}=1}^{t-1}\left[\bm{U}^{\mathrm{c}}_{tt^{\prime}}\bm{x}^{\mathrm{c}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{U}^{\mathrm{b}}_{tt^{\prime}}\bm{x}^{\mathrm{b}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\overline{U}}_{tt^{\prime}}\bm{y}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\widehat{U}}_{tt^{\prime}}\bm{z}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})\right]\right\},$

where $\bm{W}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times K_{t^{\prime}}}$ , $\bm{\overline{W}}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times|\widehat{\mathcal{I}}_{t^{\prime}}|}$ , $\bm{U}^{\mathrm{c}}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times\bar{N}_{t^{\prime}}}$ , $\bm{U}^{\mathrm{b}}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times\widehat{N}_{t^{\prime}}}$ , $\bm{\overline{U}}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times|\bar{\mathcal{I}}_{t^{\prime}+1}|}$ , and $\bm{\widehat{U}}_{tt^{\prime}}\in\mathbb{R}^{Q_{t}\times|\widehat{\mathcal{I}}_{t^{\prime}+1}|}$ are chosen to incorporate the inequalities from the original uncertainty set (12) as well as inequalities (15) and (16). Notice that the recourse variables $\bm{x}^{\mathrm{c}}_{t}$ , $\bm{x}^{\mathrm{b}}_{t}$ , $\bm{y}_{t}$ , and $\bm{z}_{t}$ now depend on $\bm{\zeta}_{[t]}$ , which represent the uncertain parameters observed up to stage $t$ .

6.2 Lifted Uncertainty

For each auxiliary uncertain parameter $\zeta_{ti}$ , we introduce two sets of new uncertain parameters and hence create a higher-dimensional lifted uncertainty space, which will facilitate the design of flexible decision rules. To this end, the bounded marginal support of $\zeta_{ti}$ , which is the same as the one of $\xi_{\tau_{i}i}$ , is partitioned into $r_{i}$ pieces defined by its lower and upper bounds as well as $r_{i}-1$ breakpoints, and we denote these points by $p_{i}^{j}$ with $j=0,1,\dots,r_{i}$ such that

\xi_{\tau_{i}i}^{\min}=p_{i}^{0}<p_{i}^{1}<\cdots<p_{i}^{r_{i}-1}<p_{i}^{r_{i}}=\xi_{\tau_{i}i}^{\max}.

We then define two lifting operators, $\bar{L}_{t}:\mathbb{R}^{|\widehat{\mathcal{I}}_{t}|}\mapsto\mathbb{R}^{\overline{K}_{t}}$ and $\widehat{L}_{t}:\mathbb{R}^{|\widehat{\mathcal{I}}_{t}|}\mapsto\{0,1\}^{\widehat{K}_{t}}$ , for each $t\in\mathcal{T}$ . The first lifting operator $\bar{L}_{t}(\bm{\zeta}_{t})$ maps $\bm{\zeta}_{t}$ onto a $\overline{K}_{t}$ -dimensional space with $\overline{K}_{t}=\sum_{i\in\widehat{\mathcal{I}}_{t}}r_{i}$ such that each lifted parameter is a piecewise linear function of the corresponding $\zeta_{ti}$ and is defined as follows:

\bar{L}_{ti}^{j}(\bm{\zeta}_{t}):=\left\{\begin{array}[]{lcl}{\zeta_{ti}}&\text{if}&r_{i}=1\\ {\inf\left\{\zeta_{ti},p_{i}^{j}\right\}}&\text{if}&r_{i}>1,\,j=1\\ {\sup\left\{\inf\left\{\zeta_{ti},p_{i}^{j}\right\}-p_{i}^{j-1},0\right\}}&\text{if}&r_{i}>1,\,j=2,\ldots,r_{i}.\end{array}\right.

By construction, we have $\zeta_{ti}=\sum_{j=1}^{r_{i}}\bar{L}_{ti}^{j}(\bm{\zeta}_{t})$ . The second lifting operator $\widehat{L}_{t}(\bm{\zeta}_{t})$ maps $\bm{\zeta}_{t}$ onto a $\widehat{K}_{t}$ -dimensional space with $\widehat{K}_{t}=\sum_{i\in\widehat{\mathcal{I}}_{t}}g_{i}$ and $g_{i}=\max\{1,\,r_{i}-1\}$ . It is defined such that

\widehat{L}_{ti}^{j}(\bm{\zeta}_{t}):=\left\{\begin{array}[]{lcl}1&\text{if}&r_{i}=1\\ {\mathbbm{1}(\zeta_{ti}\geq p_{i}^{j})}&\text{if}&r_{i}>1,\,j=1,\ldots,g_{i},\end{array}\right.

which is a piecewise constant function of $\zeta_{ti}$ .

Using the lifting operators, we can define the following lifted endogenous uncertainty set:

		$\displaystyle\widetilde{\Theta}^{\prime}_{t}(\bm{x}_{[t-1]},\bm{y}_{[t-1]},\bm{z}_{[t-1]})=\left\{(\bm{\xi}_{[t]},\bm{\zeta}_{[t]},\bm{\bar{\zeta}}_{[t]},\bm{\hat{\zeta}}_{[t]})\in\mathbb{R}^{\widetilde{K}_{[t]}}:\vphantom{\begin{array}[]{l}\sum\limits_{t^{\prime}=1}^{t}\\[8.0pt] \bm{\bar{\zeta}}_{t^{\prime}}\end{array}}\right.$
		$\displaystyle\quad\left.\begin{array}[]{l}\sum\limits_{t^{\prime}=1}^{t}\left[\bm{W}_{tt^{\prime}}\bm{\xi}_{t^{\prime}}+\bm{\overline{W}}_{tt^{\prime}}\bm{\zeta}_{t^{\prime}}\right]\leq\sum\limits_{t^{\prime}=1}^{t-1}\left[\bm{U}^{\mathrm{c}}_{tt^{\prime}}\bm{x}^{\mathrm{c}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{U}^{\mathrm{b}}_{tt^{\prime}}\bm{x}^{\mathrm{b}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\overline{U}}_{tt^{\prime}}\bm{y}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\widehat{U}}_{tt^{\prime}}\bm{z}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})\right]\\[8.0pt] \bm{\bar{\zeta}}_{t^{\prime}}=\bar{L}_{t^{\prime}}\left(\bm{\zeta}_{t^{\prime}}\right),\;\bm{\hat{\zeta}}_{t^{\prime}}=\widehat{L}_{t^{\prime}}\left(\bm{\zeta}_{t^{\prime}}\right)\quad\forall\,t^{\prime}=1,\dots,t\end{array}\right\},$

where $\widetilde{K}_{[t]}=\sum_{t^{\prime}=1}^{t}K_{t^{\prime}}+|\widehat{\mathcal{I}}_{t^{\prime}}|+\overline{K}_{t^{\prime}}+\widehat{K}_{t^{\prime}}$ , and for all $i\in\widehat{\mathcal{I}}_{t}$ , $\bm{\bar{\zeta}}_{ti}=(\bar{\zeta}_{ti}^{1},\ldots,\bar{\zeta}_{ti}^{r_{i}})$ , $\bar{L}_{ti}=(\bar{L}_{ti}^{1},\ldots,\bar{L}_{ti}^{r_{i}})$ , $\bm{\hat{\zeta}}_{ti}=(\hat{\zeta}_{ti}^{1},\ldots,\hat{\zeta}_{ti}^{g_{i}})$ , and $\widehat{L}_{ti}=(\widehat{L}_{ti}^{1},\ldots,\widehat{L}_{ti}^{g_{i}})$ . The main purpose of defining such a lifted uncertainty set is to enable the construction of linear decision rules in the space of the lifted uncertain parameters. Notice that $\widetilde{\Theta}^{\prime}_{t}$ is an open set due to the discontinuity in $\widehat{L}$ . Now consider problem (11) with $\Xi_{t}$ replaced by $\widetilde{\Theta}^{\prime}_{t}$ and the recourse variables restricted to be linear functions of $\bm{\bar{\zeta}}$ and $\bm{\hat{\zeta}}$ , and a second problem that is the same except that it considers the convex hull of the closure of $\widetilde{\Theta}^{\prime}_{t}$ , i.e. $\mathrm{conv}(\mathrm{cl}(\widetilde{\Theta}^{\prime}_{t}))$ , instead of $\widetilde{\Theta}^{\prime}_{t}$ . Following the same arguments as in Bertsimas and Georghiou ¹¹, one can show that these two problems are equivalent in a sense that they have the same optimal value and that there is a one-to-one mapping between their feasible and optimal solutions.

It is difficult to obtain a closed-form description of $\mathrm{conv}(\mathrm{cl}(\widetilde{\Theta}^{\prime}_{t}))$ as it requires finding all vertices of $\mathrm{cl}(\widetilde{\Theta}^{\prime}_{t})$ , which are decision-dependent. Moreover, the number of vertices grows exponentially with the dimensionality of $\mathrm{cl}(\widetilde{\Theta}^{\prime}_{t})$ such that including them would render most problems of practical relevance computationally intractable. Therefore, instead of $\mathrm{conv}(\mathrm{cl}(\widetilde{\Theta}^{\prime}_{t}))$ , we consider an outer approximation, which we construct as follows. For each $t\in\mathcal{T}$ , $i\in\widehat{\mathcal{I}}_{t}$ , and $j=1,\dots,r_{i}$ , we define a set of vertices $\widetilde{\mathcal{V}}_{ti}^{j}$ in the lifted space:

\widetilde{\mathcal{V}}_{ti}^{j}:=\left\{\bm{\tilde{v}}=\left(v,\,\bm{\bar{v}},\,\bm{\hat{v}}\right):\begin{array}[]{l}v\in\{p_{i}^{j-1},p_{i}^{j}\}\\ \bm{\bar{v}}=\bar{L}_{ti}(\bm{e}_{ti}v)\\ \bm{\hat{v}}=\lim\limits_{\zeta_{ti}\rightarrow v,\,\zeta_{ti}\in[p_{i}^{j-1},p_{i}^{j}]}\widehat{L}_{ti}(\bm{e}_{ti}\zeta_{ti})\end{array}\right\},

which allows the following convex hull formulation of the closure of the marginal support of $\bm{\tilde{\zeta}}_{ti}$ :

Z_{ti}=\left\{\bm{\tilde{\zeta}}_{ti}=(\zeta_{ti},\bm{\bar{\zeta}}_{ti},\bm{\hat{\zeta}}_{ti}):\begin{array}[]{l}\sum\limits_{\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}}\lambda(\bm{\tilde{v}})=1\\ \zeta_{ti}=\sum\limits_{\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}}\lambda(\bm{\tilde{v}})v\\ \bm{\bar{\zeta}}_{ti}=\sum\limits_{\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}}\lambda(\bm{\tilde{v}})\bm{\bar{v}}\\ \bm{\hat{\zeta}}_{ti}=\sum\limits_{\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}}\lambda(\bm{\tilde{v}})\bm{\hat{v}}\\ \lambda(\bm{\tilde{v}})\in\mathbb{R}_{+}\quad\forall\,\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}\end{array}\right\},

where $\widetilde{\mathcal{V}}_{ti}=\bigcup_{j=1}^{r_{i}}\widetilde{\mathcal{V}}_{ti}^{j}$ and $\lambda(\bm{\tilde{v}})$ denotes the coefficient associated with a particular vertex $\bm{\tilde{v}}\in\widetilde{\mathcal{V}}_{ti}$ . We can now obtain the following closed-form expression of an outer approximation of $\mathrm{conv}(\mathrm{cl}(\widetilde{\Xi}^{\prime}_{t}))$ :

		$\displaystyle\left\{\{\bm{\xi}_{[t]}\}\times\prod_{t^{\prime}=1}^{t}\prod_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}Z_{t^{\prime}i}\right\}\bigcap\left\{(\bm{\xi}_{[t]},\bm{\zeta}_{[t]},\bm{\bar{\zeta}}_{[t]},\bm{\hat{\zeta}}_{[t]})\in\mathbb{R}^{\widetilde{K}_{[t]}}:\sum\limits_{t^{\prime}=1}^{t}\left[\bm{W}_{tt^{\prime}}\bm{\xi}_{t^{\prime}}+\bm{\overline{W}}_{tt^{\prime}}\bm{\zeta}_{t^{\prime}}\right]\right.$		(17)
		$\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\leq\left.\sum\limits_{t^{\prime}=1}^{t-1}\left[\bm{U}^{\mathrm{c}}_{tt^{\prime}}\bm{x}^{\mathrm{c}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{U}^{\mathrm{b}}_{tt^{\prime}}\bm{x}^{\mathrm{b}}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\overline{U}}_{tt^{\prime}}\bm{y}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})+\bm{\widehat{U}}_{tt^{\prime}}\bm{z}_{t^{\prime}}(\bm{\zeta}_{[t^{\prime}]})\right]\right\}.$		(17)

6.3 Decision Rule Approximation

Following a similar approach as in Feng et al. (2020), we apply decision rules that are linear functions of the lifted uncertain parameters and parameterized as follows:


	$\displaystyle\bm{x}^{\mathrm{c}}_{t}=\sum_{t^{\prime}=1}^{t}\sum_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}\left(\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i}\bm{\bar{\zeta}}_{t^{\prime}i}+\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i}\bm{\hat{\zeta}}_{t^{\prime}i}\right)$		(18a)
	$\displaystyle\bm{x}^{\mathrm{b}}_{t}=\sum_{t^{\prime}=1}^{t}\sum_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i}\bm{\hat{\zeta}}_{t^{\prime}i},\quad\bm{y}_{t}=\sum_{t^{\prime}=1}^{t}\sum_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}\bm{\widehat{Y}}_{tt^{\prime}i}\bm{\hat{\zeta}}_{t^{\prime}i},\quad\bm{z}_{t}=\sum_{t^{\prime}=1}^{t}\sum_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}\bm{\widehat{Z}}_{tt^{\prime}i}\bm{\hat{\zeta}}_{t^{\prime}i},$		(18b)

where $\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i}\in\mathbb{R}^{\bar{N}_{t}\times r_{i}}$ , $\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i}\in\mathbb{R}^{\bar{N}_{t}\times g_{i}}$ , $\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{\widehat{N}_{t}\times g_{i}}$ , $\bm{\widehat{Y}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{|\bar{\mathcal{I}}_{t+1}|\times g_{i}}$ , and $\bm{\widehat{Z}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{|\widehat{\mathcal{I}}_{t+1}|\times g_{i}}$ . The decision rules for the continuous variables in (18a) can form continuous or discontinuous piecewise linear functions of $\bm{\zeta}_{[t]}$ . With the given domains for $\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i}$ , $\bm{\widehat{Y}}_{tt^{\prime}i}$ , and $\bm{\widehat{Z}}_{tt^{\prime}i}$ , a variable following a decision rule of the form (18b) is guaranteed to be binary, i.e. can only take the value 0 or 1, if we also constrain it to be in $[0,1]$ ¹¹. This property implies that the integrality constraints on the binary variables can be relaxed when these decision rules are applied.

By applying the proposed decision rules in conjunction with the uncertainty set in (17), decision-dependent nonanticipativity is captured, and we arrive at the following approximation of the multistage robust optimization problem under endogenous uncertainty:

$\displaystyle\operatorname*{minimize}_{\begin{subarray}{c}\bm{\overline{X}}^{\mathrm{c}},\,\bm{\widehat{X}}^{\mathrm{c}},\\ \bm{\widehat{X}}^{\mathrm{b}},\,\bm{\widehat{Y}},\,\bm{\widehat{Z}}\end{subarray}}$	$\displaystyle\bm{e}_{1}^{\top}\left(\bm{\overline{X}}^{\mathrm{c}}_{111}+\bm{\widehat{X}}^{\mathrm{c}}_{111}\right)$	(19)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle\left(\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i},\bm{\widehat{Y}}_{tt^{\prime}i},\bm{\widehat{Z}}_{tt^{\prime}i}\right)\in\Omega_{tt^{\prime}i}\quad\forall\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}}$
	$\displaystyle\left.\begin{array}[]{l}\sum\limits_{t^{\prime}=1}^{t}\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\left[\bm{A}^{\mathrm{c}}_{tt^{\prime}}\left(\bm{\overline{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\bar{\zeta}}_{t^{\prime\prime}i}+\bm{\widehat{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right)+\bm{A}^{\mathrm{b}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right.\\ \quad\quad\quad\quad\quad\quad\quad+\left.\bm{D}_{tt^{\prime}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}+\bm{H}_{tt^{\prime}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right]\leq\bm{B}_{t}\bm{\xi}_{[t]}\\[8.0pt] \bm{0}\leq\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\leq\bm{e}\quad\forall\,t^{\prime}=1,\dots,t\\[4.0pt] \bm{0}\leq\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\leq\bm{e}\quad\forall\,t^{\prime}=1,\dots,t\\[4.0pt] \bm{0}\leq\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\leq\bm{e}\quad\forall\,t^{\prime}=1,\dots,t\end{array}\right\}$
	$\displaystyle\quad\quad\quad\quad\quad\forall\,t\in\mathcal{T},\,(\bm{\xi}_{[t]},\bm{\zeta}_{[t]},\bm{\bar{\zeta}}_{[t]},\bm{\hat{\zeta}}_{[t]})\in\widetilde{\Theta}_{t}\left(\bm{\overline{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{b}}_{[t-1]},\bm{\widehat{Y}}_{[t-1]},\bm{\widehat{Z}}_{[t-1]}\right),$

where

\Omega_{tt^{\prime}i}=\left\{\left(\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i},\bm{\widehat{Y}}_{tt^{\prime}i},\bm{\widehat{Z}}_{tt^{\prime}i}\right):\begin{array}[]{l}\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i}\in\mathbb{R}^{\bar{N}_{t}\times r_{i}}\\[4.0pt] \bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i}\in\mathbb{R}^{\bar{N}_{t}\times g_{i}}\\[4.0pt] \bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{\widehat{N}_{t}\times g_{i}}\\[4.0pt] \bm{\widehat{Y}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{|\bar{\mathcal{I}}_{t+1}|\times g_{i}}\\[4.0pt] \bm{\widehat{Z}}_{tt^{\prime}i}\in\{-1,\,0,\,1\}^{|\widehat{\mathcal{I}}_{t+1}|\times g_{i}}\end{array}\right\},

and $\bm{A}^{\mathrm{c}}_{tt^{\prime}}$ and $\bm{A}^{\mathrm{b}}_{tt^{\prime}}$ are appropriately defined constraint matrices associated with the corresponding continuous and binary $\bm{x}$ -variables, respectively. The uncertainty set is

		$\displaystyle\widetilde{\Theta}_{t}\left(\bm{\overline{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{b}}_{[t-1]},\bm{\widehat{Y}}_{[t-1]},\bm{\widehat{Z}}_{[t-1]}\right)$
		$\displaystyle\quad\quad\quad=\left\{\{\bm{\xi}_{[t]}\}\times\prod_{t^{\prime}=1}^{t}\prod_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}Z_{t^{\prime}i}\right\}\bigcap\,\widehat{\Theta}_{t}\left(\bm{\overline{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{b}}_{[t-1]},\bm{\widehat{Y}}_{[t-1]},\bm{\widehat{Z}}_{[t-1]}\right)$

with

	$\displaystyle\widehat{\Theta}_{t}$	$\displaystyle=\left\{(\bm{\xi}_{[t]},\bm{\zeta}_{[t]},\bm{\bar{\zeta}}_{[t]},\bm{\hat{\zeta}}_{[t]})\in\mathbb{R}^{\widetilde{K}_{[t]}}:\sum_{t^{\prime}=1}^{t}\left[\bm{W}_{tt^{\prime}}\bm{\xi}_{t^{\prime}}+\bm{\overline{W}}_{tt^{\prime}}\bm{\zeta}_{t^{\prime}}\right]\right.$
		$\displaystyle\quad\quad\leq\left.\sum\limits_{t^{\prime}=1}^{t-1}\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\left[\bm{U}^{\mathrm{c}}_{tt^{\prime}}\left(\bm{\overline{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\bar{\zeta}}_{t^{\prime\prime}i}+\bm{\widehat{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right)+\bm{U}^{\mathrm{b}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}+\bm{\overline{U}}_{tt^{\prime}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}+\bm{\widehat{U}}_{tt^{\prime}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right]\right\}.$

For ease of exposition, we define new constraint and right-hand-side matrices such that we can express problem (19) in the following compact form:

$\displaystyle\operatorname*{minimize}_{\begin{subarray}{c}\bm{\overline{X}}^{\mathrm{c}},\,\bm{\widehat{X}}^{\mathrm{c}},\\ \bm{\widehat{X}}^{\mathrm{b}},\,\bm{\widehat{Y}},\,\bm{\widehat{Z}}\end{subarray}}$	$\displaystyle\bm{e}_{1}^{\top}\left(\bm{\overline{X}}^{\mathrm{c}}_{111}+\bm{\widehat{X}}^{\mathrm{c}}_{111}\right)$	(20)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle\sum\limits_{t^{\prime}=1}^{t}\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\sum\limits_{i\in\widehat{\mathcal{I}}_{t^{\prime\prime}}}\left[\bm{\widetilde{A}}^{\mathrm{c}}_{tt^{\prime}}\bm{\overline{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\bar{\zeta}}_{t^{\prime\prime}i}+\left(\bm{\widetilde{A}}^{\mathrm{c}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{A}}^{\mathrm{b}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{D}}_{tt^{\prime}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{H}}_{tt^{\prime}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\right)\bm{\hat{\zeta}}_{t^{\prime\prime}i}\right]$
	$\displaystyle\quad\quad\quad\leq\sum_{t^{\prime}=1}^{t}\bm{\widetilde{B}}_{tt^{\prime}}\bm{\xi}_{t^{\prime}}\quad\forall\,t\in\mathcal{T},\,(\bm{\xi}_{[t]},\bm{\zeta}_{[t]},\bm{\bar{\zeta}}_{[t]},\bm{\hat{\zeta}}_{[t]})\in\widetilde{\Theta}_{t}\left(\bm{\overline{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{c}}_{[t-1]},\bm{\widehat{X}}^{\mathrm{b}}_{[t-1]},\bm{\widehat{Y}}_{[t-1]},\bm{\widehat{Z}}_{[t-1]}\right)$
	$\displaystyle\left(\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i},\bm{\widehat{Y}}_{tt^{\prime}i},\bm{\widehat{Z}}_{tt^{\prime}i}\right)\in\Omega_{tt^{\prime}i}\quad\forall\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}},$

where $\bm{\widetilde{A}}^{\mathrm{c}}_{tt^{\prime}}\in\mathbb{R}^{P_{t}\times\bar{N}_{t^{\prime}}}$ , $\bm{\widetilde{A}}^{\mathrm{b}}_{tt^{\prime}}\in\mathbb{R}^{P_{t}\times\widehat{N}_{t^{\prime}}}$ , $\bm{\widetilde{D}}_{tt^{\prime}}\in\mathbb{R}^{P_{t}\times|\bar{\mathcal{I}}_{t^{\prime}+1}|}$ , $\bm{\widetilde{H}}_{tt^{\prime}}\in\mathbb{R}^{P_{t}\times|\widehat{\mathcal{I}}_{t^{\prime}+1}|}$ , and $\bm{\widetilde{B}}_{tt^{\prime}}\in\mathbb{R}^{P_{t}\times K_{t^{\prime}}}$ .

6.4 Reformulation

Applying standard robust optimization techniques based on constraint-wise worst-case reformulation and linear programming (LP) duality ⁵⁶, the semi-infinite program (20) can be reformulated into the following problem:

$\displaystyle\operatorname*{minimize}_{\begin{subarray}{c}\bm{\overline{X}}^{\mathrm{c}},\,\bm{\widehat{X}}^{\mathrm{c}},\,\bm{\widehat{X}}^{\mathrm{b}},\\ \bm{\widehat{Y}},\,\bm{\widehat{Z}},\,\bm{\Phi},\,\bm{\Psi}\end{subarray}}$	$\displaystyle\bm{e}_{1}^{\top}\left(\bm{\overline{X}}^{\mathrm{c}}_{111}+\bm{\widehat{X}}^{\mathrm{c}}_{111}\right)$	(21)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle\sum_{t^{\prime}=1}^{t}\sum_{i\in\widehat{\mathcal{I}}_{t^{\prime}}}\bm{\psi}_{tt^{\prime}i}\leq\bm{0}\quad\forall\,t\in\mathcal{T}$
	$\displaystyle\bm{\Phi}_{t}\bm{w}_{tt^{\prime}i}=-\bm{\tilde{b}}_{tt^{\prime}i}\quad\forall\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t,\,i\in\widetilde{\mathcal{I}}_{t^{\prime}}$
	$\displaystyle\bm{\psi}_{tt^{\prime\prime}i}+\bm{\Phi}_{t}\left[\bm{\bar{w}}_{tt^{\prime\prime}i}v_{t^{\prime\prime}i}-\sum\limits_{t^{\prime}=t^{\prime\prime}}^{t-1}\bm{U}^{\mathrm{c}}_{tt^{\prime}}\bm{\overline{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\bar{v}}_{t^{\prime\prime}i}\right.$
	$\displaystyle\quad\quad\quad\quad\quad\quad-\left.\sum\limits_{t^{\prime}=t^{\prime\prime}}^{t-1}\left(\bm{U}^{\mathrm{c}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}+\bm{U}^{\mathrm{b}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}+\bm{\overline{U}}_{tt^{\prime}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widehat{U}}_{tt^{\prime}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\right)\bm{\hat{v}}_{t^{\prime\prime}i}\right]$
	$\displaystyle\quad\quad\geq\sum\limits_{t^{\prime}=t^{\prime\prime}}^{t}\left[\bm{\widetilde{A}}^{\mathrm{c}}_{tt^{\prime}}\bm{\overline{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}\bm{\bar{v}}_{t^{\prime\prime}i}+\left(\bm{\widetilde{A}}^{\mathrm{c}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{c}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{A}}^{\mathrm{b}}_{tt^{\prime}}\bm{\widehat{X}}^{\mathrm{b}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{D}}_{tt^{\prime}}\bm{\widehat{Y}}_{t^{\prime}t^{\prime\prime}i}+\bm{\widetilde{H}}_{tt^{\prime}}\bm{\widehat{Z}}_{t^{\prime}t^{\prime\prime}i}\right)\bm{\hat{v}}_{t^{\prime\prime}i}\right]$
	$\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\forall\,t\in\mathcal{T},\,t^{\prime\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime\prime}},\,\bm{\tilde{v}}_{t^{\prime\prime}i}\in\widetilde{\mathcal{V}}_{t^{\prime\prime}i}$
	$\displaystyle\bm{\psi}_{tt^{\prime}i}\in\mathbb{R}^{P_{t}}\quad\forall\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}}$
	$\displaystyle\bm{\Phi}_{t}\in\mathbb{R}_{+}^{P_{t}\times Q_{t}}\quad\forall\,t\in\mathcal{T}$
	$\displaystyle\left(\bm{\overline{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{c}}_{tt^{\prime}i},\bm{\widehat{X}}^{\mathrm{b}}_{tt^{\prime}i},\bm{\widehat{Y}}_{tt^{\prime}i},\bm{\widehat{Z}}_{tt^{\prime}i}\right)\in\Omega_{tt^{\prime}i}\quad\forall\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t,\,i\in\widehat{\mathcal{I}}_{t^{\prime}}$

where $\bm{w}_{tt^{\prime}i}$ , $\bm{\bar{w}}_{tt^{\prime}i}$ $\bm{\tilde{b}}_{tt^{\prime}i}$ , and $\bm{\psi}_{ti}$ denote the columns of $\bm{W}_{tt^{\prime}}$ , $\bm{\overline{W}}_{tt^{\prime}}$ , $\bm{\widetilde{B}}_{tt^{\prime}}$ , and $\bm{\Psi}_{t}$ , respectively, corresponding to uncertain parameter $i$ . The detailed derivation of the reformulation can be found in the supplementary material.

Problem (21) is generally a mixed-integer quadratically constrained program (MIQCP) as it involves bilinear terms. However, if the endogenous uncertainty set only depends on binary variables, the problem can be reformulated into a mixed-integer linear program (MILP) by expressing each integer variable as the difference between two binary variables and linearizing the bilinear terms, which will all be products of a continuous and a binary variable (for more details, see Feng et al. (2020)).

7 Computational Case Studies

In this section, we present the results from four computational case studies that demonstrate the versatility of the proposed modeling framework. All model instances were implemented in Julia v1.3.1 using the modeling environment JuMP v0.21.2 ⁵⁷ and solved using Gurobi v8.1.1 on a Intel Core i7-8700 CPU at 3.20 GHz machine with 8 GB RAM.

7.1 Pilot Plant and Plant Redesign

We first consider an illustrative example that is based on Example 1. In this problem, we have a design of an industrial-scale plant whose capacity $q$ is uncertain and only becomes known after the plant is built. However, as described in Example 1, a pilot plant can be built such that its capacity $p$ helps provide a more accurate prediction of $q$ . Once $p$ is observed, the marginal uncertainty set for $q$ is updated, based on which we can decide to build one or two plants using the current design or redesign the plant and build a plant using the new design. The solution has to ensure that a given demand $d$ can be met, and the objective is to optimize the worst case. The associated robust optimization problem can be formulated as follows:

$\displaystyle\operatorname*{minimize}_{x,y,z_{1},z_{2}}$	$\displaystyle\delta x+\max_{(p,q,\gamma)\in\Xi(x)}\gamma y(xp)+\rho\left[z_{1}(xp)+z_{2}(xp)\right]$	(22)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle x\in\{0,1\}$
	$\displaystyle\left.\begin{array}[]{l}y(xp)\leq x\\[4.0pt] d\leq q\left[z_{1}(xp)+z_{2}(xp)\right]+dy(xp)\\[4.0pt] y(xp),\,z_{1}(xp),\,z_{2}(xp)\in\{0,1\}\end{array}\right\}\quad\forall\,(p,q,\gamma)\in\Xi(x),$

where the binary variable $x$ equals 1 if the pilot plant is built and 0 otherwise, $y$ equals 1 if a plant with a new design and capacity $d$ is built, and $z_{1}$ and $z_{2}$ indicate how many (maximum two) plants with the current design are built. Nonnegative cost coefficients in the cost function are denoted by $\delta$ , $\gamma$ , and $\rho$ , where $\gamma$ is an uncertain parameter as it depends on the pilot plant capacity. As indicated by the constraint $y(xp)\leq x$ , a plant redesign is only possible if a pilot plant is built and hence more process knowledge is available. Finally, the total plant capacity has to be greater than or equal to demand $d$ . The uncertainty set depends on $x$ and can be expressed as follows:

\Xi(x)=\left\{(p,q,\gamma)\in\mathbb{R}_{+}^{3}:\begin{array}[]{l}p^{\min}x\leq p\leq p^{\max}x\\[4.0pt] q^{\min}(1-x)+(\alpha-\Delta)x+\beta p\leq q\leq q^{\max}(1-x)+(\alpha+\Delta)x+\beta p\\[4.0pt] \gamma=(\rho+\tilde{\gamma}p^{\max})x-\tilde{\gamma}p\end{array}\right\},

where $p$ and $\gamma$ can only take nonzero values if $x=1$ . Similarly, $(\alpha-\Delta)+\beta p\leq q\leq(\alpha+\Delta)+\beta p$ if $x=1$ , and $q^{\min}\leq q\leq q^{\max}$ if $x=0$ . The parameters are given such that $q^{\min}=(\alpha-\Delta)+\beta p^{\min}$ and $q^{\max}=(\alpha+\Delta)+\beta p^{\max}$ . We further assume that $q^{\min}<d\leq 2q^{\min}$ and that there exists a $\hat{p}<p^{\max}$ such that $d=(\alpha-\Delta)+\beta\hat{p}$ .

7.1.1 Analytical Solution

Problem (22) can be solved analytically, and the solution is illustrated in the decision diagram shown in Figure 4. If we choose to build a pilot plant ( $x=1$ ), we can take recourse actions ( $y$ , $z_{1}$ , and $z_{2}$ ) based on the realization of $p$ . If $p\geq\hat{p}$ , we know that $q\geq d$ ; hence, we only have to build one plant using the current design. If $p<\hat{p}$ , building one plant with the current design may not be sufficient to meet demand $d$ . In this case, we choose the less expensive option between building one plant with a new design that achieves a capacity of $d$ and building two plants with the current design. The costs for these two choices are $v=\delta+\rho+\tilde{\gamma}(p^{\max}-p)$ and $v=\delta+2\rho$ , respectively. Thus, the first option is less expensive than the second if $\tilde{\gamma}(p^{\max}-p)<\rho$ . In contrast, if we do not build a pilot plant ( $x=0$ ), there will be no recourse; in that case, in order to guarantee feasibility for every possible realization of $q$ , we have to build two plants using the current design, which results in a cost of $v=2\rho$ .

The problem further simplifies since the objective is to minimize the worst-case cost and the worst case is already known a priori, namely $p=p^{\min}$ , $q=q^{\min}$ , and $\gamma=\rho+\tilde{\gamma}(p^{\max}-p^{\min})$ . Therefore, the optimal value of (22) is simply

v^{*}=\min\left\{2\rho,\,\delta+\rho+\tilde{\gamma}(p^{\max}-p^{\min})\right\},

and the corresponding optimal solution is as follows: If $\rho<\delta+\tilde{\gamma}(p^{\max}-p^{\min})$ , $x^{*}=y^{*}=0$ and $z_{1}^{*}=z_{2}^{*}=1$ ; if $\rho>\delta+\tilde{\gamma}(p^{\max}-p^{\min})$ , $x^{*}=y^{*}=1$ and $z_{1}^{*}=z_{2}^{*}=0$ . Both solutions are optimal if $\rho=\delta+\tilde{\gamma}(p^{\max}-p^{\min})$ .

For our computational case study, we choose $\rho=100$ , $d=59$ , $p^{\min}=5$ , $p^{\max}=10$ , $q^{\min}=35$ , $q^{\max}=75$ , $\alpha=10$ , $\beta=6$ , and $\Delta=5$ . The heat map in Figure 5a shows $v^{*}$ as a function of $\delta$ and $\tilde{\gamma}$ . The red line is given by $\delta=\rho-(p^{\max}-p^{\min})\tilde{\gamma}=100-5\tilde{\gamma}$ , which indicates the boundary at which $v^{*}$ switches from $\delta+\rho+\tilde{\gamma}(p^{\max}-p^{\min})$ to $2\rho$ .

7.1.2 Non-Worst-Case Objective Function

As mentioned above, in this example, recourse is only possible if $x=1$ , which essentially means that this decision determines whether the problem is static or two-stage. However, note that recourse is not required in the optimal solution of (22), which is due to the fact that worst-case cost minimization is considered and that the worst case is known a priori. In the following, we consider an alternative, scenario-based objective function, which is often a better choice in practice. Here, we introduce a discrete scenario set $\mathcal{S}$ , which contains possible realizations of $p$ , each denoted by $\bar{p}_{s}$ with probability $\phi_{s}$ , where $\sum_{s\in\mathcal{S}}\phi_{s}=1$ . We then replace the objective function in problem (22) with

\delta x+\sum_{s\in\mathcal{S}}\phi_{s}\left(\bar{\gamma}_{s}y(x\bar{p}_{s})+\rho\left[z_{1}(x\bar{p}_{s})+z_{2}(x\bar{p}_{s})\right]\right)

(23)

with $\bar{\gamma}_{s}=\rho+\tilde{\gamma}(p^{\max}-\bar{p}_{s})$ . This objective function resembles a sample average approximation of the expected cost as commonly used in stochastic programming. We further reformulate the constraints to eliminate random recourse and remove $\gamma$ from the uncertainty set, arriving at the following formulation:

$\displaystyle\operatorname*{minimize}_{x,y,z_{1},z_{2},w_{1},w_{2}}$	$\displaystyle\delta x+\sum_{s\in\mathcal{S}}\phi_{s}\left(\bar{\gamma}_{s}y(x\bar{p}_{s})+\rho\left[z_{1}(x\bar{p}_{s})+z_{2}(x\bar{p}_{s})\right]\right)$	(24)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle x\in\{0,1\}$
	$\displaystyle\left.\begin{array}[]{l}y(xp)\leq x\\[4.0pt] w_{1}(xp)\leq q\\[4.0pt] w_{1}(xp)\leq q^{\max}z_{1}(xp)\\[4.0pt] w_{2}(xp)\leq q\\[4.0pt] w_{2}(xp)\leq q^{\max}z_{2}(xp)\\[4.0pt] d\leq w_{1}(xp)+w_{2}(xp)+dy(xp)\\[4.0pt] y(xp),\,z_{1}(xp),\,z_{2}(xp)\in\{0,1\}\\ w_{1}(xp),\,w_{2}(xp)\in\mathbb{R}_{+}\end{array}\right\}\quad\forall\,(p,q)\in\overline{\Xi}(x)$

with

\small\overline{\Xi}(x)=\left\{(p,q)\in\mathbb{R}_{+}^{2}:\begin{array}[]{l}p^{\min}x\leq p\leq p^{\max}x\\[4.0pt] q^{\min}(1-x)+(\alpha-\Delta)x+\beta p\leq q\leq q^{\max}(1-x)+(\alpha+\Delta)x+\beta p\end{array}\right\}.

Note that we have introduced two new continuous recourse variables, $w_{1}$ and $w_{2}$ , which represent the production amounts of plants built with the current design.

The optimal solution to this modified problem is less trivial as now recourse decisions, as depicted in Figure 4, play a crucial role. Still, we can solve it analytically. If we choose to build a pilot plant, the optimal recourse decisions are just as described in Section 7.1.1. As a result, the optimal value is

v^{*}=\min\left\{2\rho,\,\delta+\sum_{s\in\mathcal{S}_{1}}\phi_{s}\rho+\sum_{s\in\mathcal{S}_{2}}\phi_{s}[\rho+\tilde{\gamma}(p^{\max}-\bar{p}_{s})]+\sum_{s\in\mathcal{S}_{3}}2\phi_{s}\rho\right\},

where $\mathcal{S}_{1}:=\{s\in\mathcal{S}:\bar{p}_{s}\geq\hat{p}\}$ , $\mathcal{S}_{2}:=\{s\in\mathcal{S}:\bar{p}_{s}<\hat{p},\,\tilde{\gamma}(p^{\max}-\bar{p}_{s})<\rho\}$ , and $\mathcal{S}_{3}:=\{s\in\mathcal{S}:\bar{p}_{s}<\hat{p},\,\tilde{\gamma}(p^{\max}-\bar{p}_{s})\geq\rho\}$ such that $\mathcal{S}=\mathcal{S}_{1}\cup\mathcal{S}_{2}\cup\mathcal{S}_{3}$ .

We can also apply the proposed decision rule approach to solve this problem. To obtain the exact optimal solution, the marginal support of $p$ needs to be partitioned using appropriate breakpoints. Depending on the parameter values, we require at least one or two breakpoints. If $\tilde{\gamma}(p^{\max}-p^{\min})\leq\rho$ or $\tilde{\gamma}(p^{\max}-\hat{p})\geq\rho$ , we only need one breakpoint at $\hat{p}$ . If $\tilde{\gamma}(p^{\max}-p^{\min})>\rho$ and $\tilde{\gamma}(p^{\max}-\hat{p})<\rho$ , we need two breakpoints, one at $p^{\max}-\rho/\tilde{\gamma}$ and one at $\hat{p}$ . Note that the same decision rules are used across all scenarios in $\mathcal{S}$ ; as a result, compared to the worst-case formulation, no additional variables are added to the model such that a comparable computational complexity is achieved. We solve the problem using 100 scenarios sampled from a uniform distribution supported on the given uncertainty set, and obtain the heat map shown in Figure 5b.

7.2 Maintenance Planning with Inspections

This example involves type-1, type-2a, and type-2b endogenous uncertainty. Here, we consider integrated production and maintenance planning with equipment degradation. We use the remaining useful life (RUL) as a measure of equipment health, which decreases with continued operation. To avoid equipment failure, the RUL has to remain greater than zero and can be restored through maintenance. Investing in an equipment upgrade, which would reduce the degradation rate, is also considered. However, degradation processes are inherently stochastic, which renders the change in RUL an uncertain parameter. Furthermore, RUL is an abstract quantity that is often difficult to estimate from operational process data such that elaborate inspections are required to obtain reliable equipment health information. Therefore, in this example, we assume that the RUL can only be observed through inspections, and formulate the resulting multistage robust optimization problem as follows:

$\displaystyle\operatorname*{minimize}_{x,\bm{y},\bm{p},\bm{s},\bm{r},\bm{m},\bm{f},\bm{z}}$	$\displaystyle\alpha x+\max_{(\bm{\xi},\bm{\bar{\xi}},\bm{\zeta})\in\Theta_{T}(x,\bm{y},\bm{z})}\sum_{t\in\mathcal{T}}\left(\beta y_{t}+\gamma p_{t}+\delta z_{t}+\rho m_{t}-\varphi f_{t}\right)$	(25)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle x\in\{0,1\}$
	$\displaystyle\left.\begin{array}[]{l}s_{t}=s_{0}+\sum\limits_{t^{\prime}=1}^{t}(p_{t^{\prime}}-d_{t^{\prime}})\leq s^{\max}\\[4.0pt] p_{t}\leq Cy_{t}\\[4.0pt] r_{t}=r_{0}-\sum\limits_{t^{\prime}=1}^{t}\xi_{t^{\prime}}+\sum\limits_{t^{\prime}=1}^{t}f_{t^{\prime}}\leq r^{\max}\\[4.0pt] f_{t}\leq r^{\max}m_{t}\\[4.0pt] y_{t}+m_{t}\leq 1\\[4.0pt] m_{t}\leq z_{t-1}\\[4.0pt] p_{t},\,s_{t},\,r_{t},\,f_{t}\in\mathbb{R}_{+},\quad y_{t},\,z_{t},\,m_{t}\in\{0,1\}\end{array}\right\}\begin{array}[]{l}\forall\,t\in\mathcal{T},\\[4.0pt] (\bm{\xi}_{[t]},\bm{\bar{\xi}}_{[t]},\bm{\zeta}_{[t]})\in\Theta_{t}(x,\bm{y}_{[t]},\bm{z}_{[t]}),\end{array}$

where $x$ equals 1 if an equipment upgrade is performed, $y_{t}$ equals 1 if the plant operates in time period $t$ , $p_{t}$ is the amount produced in time period $t$ , $s_{t}$ is the inventory level at time $t$ (end of time period $t$ ), $r_{t}$ is the RUL at time $t$ , $m_{t}$ equals 1 if maintenance is performed in time period $t$ , $f_{t}$ is the amount of RUL restored in time period $t$ , and $z_{t}$ equals 1 if an inspection is conducted at time $t$ . The recourse variables are functions of uncertain parameters; however, for the sake of brevity, we do not explicitly indicate this in (25). With given positive cost coefficients $\alpha$ , $\beta$ , $\gamma$ , $\delta$ , $\rho$ , and $\varphi$ , the objective is to minimize the worst-case cost over the planning horizon given by the set of time periods $\mathcal{T}=\{1,\dots,T\}$ . Note that a negative cost is assigned to $f_{t}$ in order to encourage maintenance toward the end of the equipment’s residual life and to retain a reasonable RUL value at the end of the planning horizon. Given an initial inventory $s_{0}$ and demand $d_{t}$ for every time period $t$ , the second constraint in (25) expresses material balance and sets an upper bound $s^{\max}$ on the inventory. The next constraint states that the production amount is bounded by the plant capacity $C$ . Assuming the initial RUL $r_{0}$ is known, $r_{t}$ is computed and bounded in the next constraint. The RUL reduction caused by operation in time period $t$ is denoted by $\xi_{t}$ , which is uncertain. If maintenance is performed, the RUL can be restored to its maximum $r^{\max}$ , as stated in the next inequality. Finally, the last two constraints state that maintenance can only be performed if the plant is shut down and needs to be immediately preceded by an inspection. We further assume that $z_{0}=m_{1}=f_{1}=0$ .

The decision-dependent uncertainty set, which includes auxiliary uncertain parameters, is

\Theta_{t}(x,\bm{y}_{[t]},\bm{z}_{[t]})=\left\{(\bm{\xi}_{[t]},\bm{\bar{\xi}}_{[t]},\bm{\zeta}_{[t]})\in\mathbb{R}^{3t}_{+}:\begin{array}[]{l}\bar{\xi}_{t^{\prime}}=\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\xi_{t^{\prime\prime}}\quad\forall\,t^{\prime}=1,\dots,t\\[8.0pt] \xi^{\min}y_{t^{\prime}}\leq\xi_{t^{\prime}}\leq\xi^{\max}y_{t^{\prime}}\quad\forall\,t^{\prime}=1,\dots,t\\[4.0pt] \xi_{t^{\prime}}\leq\xi^{\max}-\xi^{\Delta}x\quad\forall\,t^{\prime}=1,\dots,t\\[4.0pt] \bar{\xi}_{t^{\prime}}-t^{\prime}\xi^{\max}(1-z_{t^{\prime}})\leq\zeta_{t^{\prime}}\leq\bar{\xi}_{t^{\prime}}\quad\forall\,t^{\prime}=1,\dots,t\\[4.0pt] \zeta_{t^{\prime}}\leq\bar{\xi}^{\max}_{t^{\prime}}z_{t^{\prime}}\quad\forall\,t^{\prime}=1,\dots,t\end{array}\right\},

where the auxiliary uncertain parameter $\bar{\xi}_{t}$ is the cumulative RUL reduction up to time $t$ . The reason for introducing $\bar{\xi}_{t}$ is that an inspection at time $t$ reveals $\bar{\xi}_{t}$ but not each individual $\xi_{t^{\prime}}$ that has contributed to the cumulative RUL reduction since the last inspection. The RUL reduction in time period $t$ , $\xi_{t}$ , is zero if the plant does not operate, i.e. $y_{t}=0$ , and bounded by $\xi^{\min}$ and $\xi^{\max}$ otherwise. This upper bound is further reduced by $\xi^{\Delta}$ if an equipment upgrade is performed, i.e. $x=1$ . We have another auxiliary variable $\zeta_{t}$ for every $t\in\mathcal{T}$ , which takes the value of $\bar{\xi}_{t}$ if it is observed through an inspection at time $t$ , i.e. $z_{t}=1$ , and is zero otherwise.

Note that $\mathcal{T}$ denotes the set of time periods, not the set of stages. In fact, strictly speaking, the number of stages depends on the solution and is directly related to the number of inspections. The maximum possible number of stages is $T$ , which is only achieved if inspection is performed at every $t\in\mathcal{T}$ . In that case, after eliminating the state variables $s_{t}$ and $r_{t}$ , the first-stage decisions are $x$ , $y_{1}$ , $p_{1}$ , and $z_{1}$ . Then, $\bar{\xi}_{1}$ is observed and the second-stage decisions are $y_{2}$ , $p_{2}$ , $m_{2}$ , $f_{2}$ , and $z_{2}$ ; this continues similarly up to stage $T$ . The number of stages is implicitly encoded in the recourse variables that are functions of only observed uncertain parameters. Specifically, for all $t\in\mathcal{T}$ , $y_{t}=y_{t}(\bm{\zeta}_{[t-1]})$ , $p_{t}=p_{t}(\bm{\zeta}_{[t-1]})$ , $m_{t}=m_{t}(\bm{\zeta}_{[t-1]})$ , $f_{t}=f_{t}(\bm{\zeta}_{[t-1]})$ , and $z_{t}=z_{t}(\bm{\zeta}_{[t-1]})$ .

We conduct a case study with the following data: $T=7$ , $\beta=\$\,2\times 10^{3}$ , $\gamma=\$\,400$ , $\delta=\$\,10^{3}$ , $\rho=\$\,2.5\times 10^{5}$ , $\varphi=\$\,2\times 10^{3}$ , $s_{0}=10$ , $s^{\max}=50$ , $C=90$ , $r_{0}=40$ , $r^{\max}=125$ , $d_{t}=50$ for all $t\in\mathcal{T}$ , $\xi^{\min}=4$ , $\xi^{\max}=25$ , and $\xi^{\Delta}=13$ . Two cases are considered: Case 1, in which the equipment upgrade cost $\alpha$ is $\$\,5\times 10^{4}$ , and Case 2 with $\alpha$ being $\$\,10^{4}$ . For each case, we solve one instance using the above ARO model with one breakpoint placed at 8 for each $\xi_{t}$ , and another instance that disallows recourse, resulting in a static problem (SRO). The optimal values and solution times for all four instances are shown in Table 2. In both Cases 1 and 2, one can observe a clear benefit from recourse, which is reflected in an improvement in the optimal value of about 3 %. However, the computational cost for solving the ARO model is significantly higher due to the larger model size. Specifically, the ARO model has 218 binary variables, 320,150 continuous variables, and 952,374 constraints after Gurobi’s preprocessing.

Table 2: Computational results for instances of the maintenance planning problem.

	Cost of upgrade (k$)	Model	Optimal value (k$)	Solution time (s)
Case 1	50	SRO	219	425
Case 1	50	ARO	212	13,621
Case 2	10	SRO	213	382
Case 2	10	ARO	206	7,837

The results in Table 2 also indicate that an equipment upgrade can be beneficial if the cost is sufficiently low as the solution suggests investing in an equipment upgrade in Case 2 but not in Case 1. Figure 6 further shows how the equipment upgrade affects the optimal schedule. As depicted in Figure 6a, without equipment upgrade, inspection is performed at time 1. Then, depending on the realization of $\bar{\xi}_{1}$ , the top or bottom schedule is chosen. If $\bar{\xi}_{1}\geq 8$ , maintenance is performed in time period 2, immediately after the inspection; otherwise, maintenance is postponed to time period 3. In contrast, the solution for Case 2, as shown in Figure 6b, suggests inspecting the equipment at time 2. This later time point is made possible by the equipment upgrade that reduces the maximum degradation rate. Then, if $\bar{\xi}_{2}\geq 16$ , maintenance is performed in time period 3; otherwise, maintenance is scheduled for time period 6.

7.3 Optimizing Revision Points in Capacity Planning

The previous example considers a problem in which the number of stages is affected by decisions due to the decision-dependent observation of uncertain parameters. There are other situations in which uncertain parameters are observed over time, independent of decisions, yet we still would like to restrict or optimize the number of stages. One example is long-term capacity planning, which involves large investments that require commitments and lead time for establishing contracts and infrastructure. Hence, it is often infeasible to change decisions frequently despite new information being constantly revealed over time. Instead, one has to restrict oneself to a few revision points at which the planning decisions are revised and updated. In their recent work, Basciftci et al. ⁵⁸ have formally introduced an adaptive two-stage stochastic programming approach in which for each decision policy, one revision point is considered and chosen as part of the optimization problem. This problem can be viewed as a stochastic program with type-2b endogenous uncertainty where planning decisions for time periods before the revision point have to be made here and now while decisions after the revision point can be adjusted based on the uncertainty revealed up to the revision point. As a result, our framework can be directly applied to solve a robust optimization variant of the problem and extensions that consider multiple revision points.

Consider a simplified power generation capacity expansion problem with $T$ time periods and a set of types of generation resources $\mathcal{I}$ . At the beginning of each time period $t$ (i.e. at time $t-1$ ), the uncertain electricity demand for the same time period, $d_{t}$ , is revealed and the amount of electricity generated from each generation type $i$ in time period $t$ , $y_{it}$ , is determined. Capacity expansion decisions, however, can only be made or updated at revision points. Let $x_{it}$ denote the amount of generation capacity of type $i$ that is added in time period $t$ and becomes available at time $t$ . Assume that for each generation type $i$ , there are two revision points, one at time $t^{\prime}_{i}-1$ , immediately after $d_{t^{\prime}_{i}}$ is revealed, and the other one at time $t^{\prime\prime}_{i}-1$ with $t^{\prime}_{i}<t^{\prime\prime}_{i}$ . Then, all $x_{it}$ with $t<t^{\prime}_{i}$ have to be chosen before any demand for the planning horizon is known, all $x_{it}$ with $t^{\prime}_{i}\leq t<t^{\prime\prime}_{i}$ are chosen at time $t^{\prime}_{i}-1$ with known $\bm{d}_{[t^{\prime}_{i}]}$ , and all $x_{it}$ with $t\geq t^{\prime\prime}_{i}$ are chosen at time $t^{\prime\prime}_{i}-1$ with known $\bm{d}_{[t^{\prime\prime}_{i}]}$ . Given a maximum number of revision points for each generation type $i$ , denoted by $n_{i}$ , we aim to place the revision points such that a given cost function, in this case a sample average approximation of the expected cost, is minimized.

We introduce binary variables $\hat{z}_{it}$ , which equals 1 if capacity expansion decisions for generation type $i$ are revised at time $t-1$ , and $z_{itt^{\prime}}$ , which equals 1 if $d_{t^{\prime}}$ is known when the decision $x_{it}$ is made. The problem can then be formulated as follows:


$\displaystyle\operatorname*{minimize}_{\bm{z},\bm{\hat{z}},\bm{x},\bm{y}}\quad$	$\displaystyle\sum_{i\in\mathcal{I}}\alpha_{i}x_{i0}+\sum_{s\in\mathcal{S}}\phi_{s}\sum_{t\in\mathcal{T}}\frac{1}{(1+r)^{t}}\left[\sum_{i\in\mathcal{I}}\left(\alpha_{i}\bar{x}_{its}+\beta_{i}\bar{y}_{its}\right)\right]$	(26a)
$\displaystyle\mathrm{subject\;to}\quad\;\;$	$\displaystyle\sum_{t\in\mathcal{T}}\hat{z}_{it}\leq n_{i}\quad\forall\,i\in\mathcal{I}$	(26b)
	$\displaystyle\frac{1}{n_{i}}\sum_{t^{\prime\prime}=t^{\prime}}^{t}\hat{z}_{it^{\prime\prime}}\leq z_{itt^{\prime}}\leq\sum_{t^{\prime\prime}=t^{\prime}}^{t}\hat{z}_{it^{\prime\prime}}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t$	(26c)
	$\displaystyle\hat{z}_{it}\in\{0,1\}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T}$	(26d)
	$\displaystyle z_{itt^{\prime}}\in\{0,1\}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t$	(26e)
	$\displaystyle x_{it}\leq C_{it}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T}$	(26f)
	$\displaystyle y_{it}\leq\sum\limits_{t^{\prime}=1}^{t}x_{i,t^{\prime}-1}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,(\bm{d}_{[t]},\bm{\zeta}_{[t]})\in\Theta_{t}(\bm{z}_{[t]})$	(26g)
	$\displaystyle\sum\limits_{i\in\mathcal{I}}y_{it}\geq d_{t}\quad\forall\,t\in\mathcal{T},\,(\bm{d}_{[t]},\bm{\zeta}_{[t]})\in\Theta_{t}(\bm{z}_{[t]})$	(26h)
	$\displaystyle x_{i0}\in\mathbb{R}_{+}\quad\forall\,i\in\mathcal{I}$	(26i)
	$\displaystyle y_{it},\,x_{it}\in\mathbb{R}_{+}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,(\bm{d}_{[t]},\bm{\zeta}_{[t]})\in\Theta_{t}(\bm{z}_{[t]}),$	(26j)

where $\mathcal{T}=\{1,\dots,T\}$ , $\mathcal{S}$ denotes the set of scenarios, $\alpha_{i}$ and $\beta_{i}$ are cost coefficients, and $r$ is a discount factor. Each scenario $s\in\mathcal{S}$ is characterized by a specific electricity demand profile $\bm{\bar{d}}_{[T],s}$ and probability $\phi_{s}$ . In the objective function (26a), $\bar{x}_{its}=x_{it}(\bm{z}_{it}\circ\bm{\bar{d}}_{[t],s})$ and $\bar{y}_{its}=y_{it}(\bm{\bar{d}}_{[t],s})$ represent the recourse decisions for specific scenarios. Constraints (26b) restrict the number of revision points. Inequalities (26c) force $z_{itt^{\prime}}$ to be 1 if there is at least one revision point between times $t^{\prime}-1$ and $t-1$ and 0 otherwise. The rationale is that if there is one or multiple revision points between $t^{\prime}-1$ and $t-1$ , then $x_{it}$ will be determined at the latest revision point within that time frame at which point $d_{t^{\prime}}$ will be known. Constraints (26f) set an upper bound, $C_{it}$ , on each $x_{it}$ . Capacity constraints are given in (26g), and constraints (26h) state that the total amount of electricity produced has to meet or exceed the demand. The decision-dependent uncertainty set is

\small\Theta_{t}(\bm{z}_{[t]})=\left\{(\bm{d}_{[t]},\bm{\zeta}_{[t]})\in\mathbb{R}_{+}^{K_{t}}:\begin{array}[]{l}d_{1}^{\min}\leq d_{1}\leq d_{1}^{\max}\\[4.0pt] \delta_{t^{\prime}}^{\min}d_{t^{\prime}-1}\leq d_{t^{\prime}}\leq\delta_{t^{\prime}}^{\max}d_{t^{\prime}-1}\quad\forall\,t^{\prime}=2,\dots,t\\ \sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}d_{t^{\prime\prime}}\leq\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\bar{\delta}^{t^{\prime\prime}-1}d_{1}\quad\forall\,t^{\prime}=1,\dots,t\\[8.0pt] d_{t^{\prime\prime}}-d^{\max}_{t^{\prime\prime}}(1-z_{it^{\prime}t^{\prime\prime}})\leq\zeta_{it^{\prime}t^{\prime\prime}}\leq d_{t^{\prime\prime}}\quad\forall\,i\in\mathcal{I},\,t^{\prime}=1,\dots,t,\,t^{\prime\prime}=1,\dots,t^{\prime}\\[4.0pt] d^{\min}_{t^{\prime\prime}}z_{it^{\prime}t^{\prime\prime}}\leq\zeta_{it^{\prime}t^{\prime\prime}}\leq d^{\max}_{t^{\prime\prime}}z_{it^{\prime}t^{\prime\prime}}\quad\forall\,i\in\mathcal{I},\,t^{\prime}=1,\dots,t,\,t^{\prime\prime}=1,\dots,t^{\prime}\end{array}\right\},

which encodes the correlation between the electricity demands in different time periods as the range in which $d_{t^{\prime}}$ can vary depends on $d_{t^{\prime}-1}$ and the budget of uncertainty imposed on $\sum_{t^{\prime\prime}=1}^{t^{\prime}}d_{t^{\prime\prime}}$ is $\sum_{t^{\prime\prime}=1}^{t^{\prime}}\bar{\delta}^{t^{\prime\prime}-1}d_{1}$ , which is a function of $d_{1}$ with a given parameter $\bar{\delta}$ . The auxiliary uncertain parameters $\bm{\zeta}_{[t]}$ depend on the choice of revision points and hence on $\bm{z}_{[t]}$ . Note that the vector $\bm{z}_{[t]}$ contains all $z_{it^{\prime}t^{\prime\prime}}$ with $i\in\mathcal{I}$ , $t^{\prime}=1,\dots,t$ , and $t^{\prime\prime}=1,\dots,t^{\prime}$ . Also, $K_{t}=t+|\mathcal{I}|t(t+1)/2$ . The recourse variables are adjustable given the following dependence on uncertain parameters: $x_{it}=x_{it}(\bm{\zeta}_{[t]})$ and $y_{it}=y_{it}(\bm{d}_{[t]})$ .

Notice that $\bm{z}$ and $\bm{\hat{z}}$ , which determine the revision points, are all first-stage variables. As a result, the decision-dependent nonanticipativity for the capacity expansion decisions $\bm{x}$ can alternatively be modeled by forcing the decision rule coefficients associated with unobserved uncertain parameters to zero (see Section 6.1). The decision rules with respect to the lifted uncertain parameters are

x_{it}=\sum_{t^{\prime}=0}^{t}\left(\bm{\bar{x}}_{itt^{\prime}}^{\top}\bm{\bar{d}}_{t^{\prime}}+\bm{\hat{x}}_{itt^{\prime}}^{\top}\bm{\hat{d}}_{t^{\prime}}\right)\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T}

with $\bar{d}_{0}=1$ and $\hat{d}_{0}=0$ . Decision-dependent nonanticipativity is then enforced by the following constraints:

	$\displaystyle-Mz_{itt^{\prime}}\bm{e}\leq\bm{\bar{x}}_{itt^{\prime}}\leq Mz_{itt^{\prime}}\bm{e}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t$
	$\displaystyle-Mz_{itt^{\prime}}\bm{e}\leq\bm{\hat{x}}_{itt^{\prime}}\leq Mz_{itt^{\prime}}\bm{e}\quad\forall\,i\in\mathcal{I},\,t\in\mathcal{T},\,t^{\prime}=1,\dots,t$

with $M$ being a sufficiently large parameter. This alternative approach allows us to work with a fixed uncertainty set:

\Theta_{t}=\left\{\bm{d}_{[t]}\in\mathbb{R}_{+}^{t}:\begin{array}[]{l}d_{1}^{\min}\leq d_{1}\leq d_{1}^{\max}\\[4.0pt] \delta_{t^{\prime}}^{\min}d_{t^{\prime}-1}\leq d_{t^{\prime}}\leq\delta_{t^{\prime}}^{\max}d_{t^{\prime}-1}\quad\forall\,t^{\prime}=2,\dots,t\\ \sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}d_{t^{\prime\prime}}\leq\sum\limits_{t^{\prime\prime}=1}^{t^{\prime}}\bar{\delta}^{t^{\prime\prime}-1}d_{1}\quad\forall\,t^{\prime}=1,\dots,t\end{array}\right\}.

Moreover, it does not require auxiliary uncertain parameters, which significantly reduces the computational complexity.

In the following case study, we consider a planning horizon of seven time periods and four generation types, namely coal, natural gas-gas turbine (NG-GT), solar PV, and wind. Cost data are adapted from Min et al. ⁵⁹, which, along with all other required data, can be found in the supplementary material. Furthermore, the objective function is constructed using 100 scenarios sampled from a uniform distribution supported on the given uncertainty set, and uncertainty lifting is performed with two equidistantly placed breakpoints for each uncertain parameter.

We solve the capacity planning problem for different maximum allowed number of revision points $\bar{n}$ , which is applied to all generation types, i.e. $n_{i}=\bar{n}$ for all $i\in\mathcal{I}$ . The results are shown in Table 3. As expected, the optimal value decreases with increasing number of revision points. The improvement is most significant from the static case (no revision) to the case with one revision point. Diminishing returns are observed at higher number of revision points. The results show that the locations of the revision points for different generation types are generally not the same. For larger $\bar{n}$ , there also seems to be no need to use the maximum number of revision points for all generation types. Finally, note that the solution times are moderate, indicating the potential to solve more complex problems of this kind using the proposed approach.

Table 3: Computational results for the capacity planning problem with different allowed numbers of revision points. The relative improvement is computed with respect to the optimal value of the case with no revision.

Allowed # of revision points	Optimal value (k$)	Solution time (s)	Relative improvement	Location of revision points
Allowed # of revision points	Optimal value (k$)	Solution time (s)	Relative improvement	Coal	NG-GT	Solar PV	Wind
0	46,183	9
1	40,852	427	11.54 %	3	5	1	2
2	39,526	1,679	14.42 %	3, 6	2, 5	1, 6	1, 3
3	38,955	764	15.65 %	2, 3, 6	1, 2, 5	1, 5, 6	1, 3, 5
4	38,612	874	16.39 %	2, 3, 5, 6	1, 2, 3, 5	1, 5, 6	1, 3, 5
5	38,423	592	16.80 %	1, 2, 3, 5, 6	1, 2, 3, 4, 5	5, 6	1, 2, 3, 5
6	38,423	462	16.80 %	1, 2, 3, 4, 5, 6	1, 2, 3, 4, 5	5, 6	1, 2, 3, 5

7.4 Production Scheduling with Active Parameter Estimation

When operating a manufacturing process, we often do not have accurate information across the whole range of feasible operation due to the lack of process data. A model that we use for process optimization may only be accurate around one or a few nominal points, especially if it is a relatively new process. For example, a production unit may be able to operate in different operating modes, but process parameters such as efficiency and processing time are only accurately known for the one operating mode that is used most of the time. The models of other, potentially better-performing operating modes can only be improved by actually operating in those modes and collecting more data; however, this exploration action comes with the risk of operating under conditions with inferior performance or damaging impact. In the following example, we consider process scheduling with active model parameter estimation, which integrates scheduling optimization and design of experiments in order to find an optimal trade-off between exploration and exploitation that maximizes the overall plant performance over time. The problem is formulated as follows:

$\displaystyle\operatorname*{minimize}_{\bm{x},\bm{s}}$	$\displaystyle\max_{\bm{p}\in\mathcal{P}_{T}(\bm{x})}\sum_{t\in\mathcal{T}}\sum_{i\in\mathcal{I}}\sum_{j\in\mathcal{J}_{i}}\bm{c}^{\top}\bm{u}_{ij}\,x_{tij}(\bm{p}_{[t]})+\delta s_{t}(\bm{p}_{[t]})$	(27)
$\displaystyle\mathrm{subject\;to}$	$\displaystyle s_{t}(\bm{p}_{[t]})=s_{0}+\sum_{t^{\prime}=1}^{t}\left(\sum_{i\in\mathcal{I}}\sum_{j\in\mathcal{J}_{i}}p_{t^{\prime}ij}-d_{t^{\prime}}\right)\leq s^{\max}\quad\forall\,t\in\mathcal{T},\,\bm{p}_{[t]}\in\mathcal{P}_{t}(\bm{x}_{[t]})$
	$\displaystyle\sum_{i\in\mathcal{I}}\sum_{j\in\mathcal{J}_{i}}x_{tij}(\bm{p}_{[t]})\leq 1\quad\forall\,t\in\mathcal{T},\,\bm{p}_{[t]}\in\mathcal{P}_{t}(\bm{x}_{[t]})$
	$\displaystyle s_{t}(\bm{p}_{[t]})\in\mathbb{R}_{+}\quad\forall\,t\in\mathcal{T},\,\bm{p}_{[t]}\in\mathcal{P}_{t}(\bm{x}_{[t]})$
	$\displaystyle x_{tij}(\bm{p}_{[t]})\in\{0,1\}\quad\forall\,t\in\mathcal{T},\,i\in\mathcal{I},\,j\in\mathcal{J}_{i},\,\bm{p}_{[t]}\in\mathcal{P}_{t}(\bm{x}_{[t]}),$

where we assume that the plant operating point is chosen among a discrete set of operating points, $\mathcal{J}_{i}$ , which depends on operating mode $i$ . Each operating point $j$ of mode $i$ is defined by a vector of inputs, $\bm{u}_{ij}$ . The binary variable $x_{tij}$ equals 1 if the plant operates at point $j$ of mode $i$ in time period $t$ . The amount produced, $p_{tij}$ , is treated as an uncertain parameter since the model parameters for the corresponding mode $i$ may be uncertain. Given cost coefficients $\bm{c}$ and $\delta$ , the objective is to minimize the worst-case cost over the entire scheduling horizon $\mathcal{T}=\{1,\dots,T\}$ . With an initial inventory $s_{0}$ , the inventory level at time $t$ , $s_{t}$ , is bounded between zero and $s^{\max}$ . The product demand in time period $t$ is denoted by $d_{t}$ .

We assume that the production amount in each operating mode $i$ is given by a linear function $p=a_{i}+\bm{b}_{i}^{\top}\bm{u}$ . For some modes, the model parameters $a_{i}$ and $\bm{b}_{i}$ are unknown at the beginning of the scheduling horizon, and they can only be estimated by operating in those modes and observing the resulting production amounts. This is captured in the following decision-dependent uncertainty set:

\small\mathcal{P}_{t}(\bm{x}_{[t]})=\left\{\bm{p}_{[t]}\in\mathbb{R}_{+}^{K_{t}}:\begin{array}[]{l}\exists\,a_{i},\,\bm{b}_{i}\;\forall\,i\in\mathcal{I}\quad\text{such that}\\[4.0pt] 0\leq p_{t^{\prime}ij}\leq p^{\max}_{i}x_{t^{\prime}ij}\quad\forall\,t^{\prime}=1,\dots,t,\,i\in\mathcal{I},\,j\in\mathcal{J}_{i}\\[4.0pt] a_{i}+\bm{b}_{i}^{\top}\bm{u}_{ij}-p_{i}^{\max}(1-x_{t^{\prime}ij})\leq p_{t^{\prime}ij}\leq a_{i}+\bm{b}_{i}^{\top}\bm{u}_{ij}\quad\forall\,t^{\prime}=1,\dots,t,\,i\in\mathcal{I},\,j\in\mathcal{J}_{i}\\[4.0pt] a^{\min}_{i}\leq a_{i}\leq a^{\max}_{i}\quad\forall\,i\in\mathcal{I}\\[4.0pt] \bm{b}^{\min}_{i}\leq\bm{b}_{i}\leq\bm{b}^{\max}_{i}\quad\forall\,i\in\mathcal{I}\end{array}\right\}

with $K_{t}=t\sum_{i\in\mathcal{I}}|\mathcal{J}_{i}|$ . Only if $x_{tij}=1$ , the production amount $p_{tij}$ can only take a nonzero value, which depends on the chosen operating point. Each operating point that has been selected up to time period $t$ activates a linear equation, which can be interpreted as a cut that reduces the size of the uncertainty set. As a result, the linear model of an operating mode is learned through the accumulation of observations, which are the production amounts at different operating points. If $\bm{u}_{ij}\in\mathbb{R}^{n}$ , and all model parameters of mode $i$ are unknown, one needs to operate in mode $i$ at different operating points at least $n+1$ times in order to uniquely determine $a_{i}$ and $\bm{b}_{i}$ .

We use a small instance to illustrate the effect of active learning in this problem. The data are as follows: $T=4$ , $c=5$ , $\delta=2$ , $\bm{d}=(32.5,15,51,42.5)$ , $s^{\max}=100$ and $s_{0}=0$ . We consider two operating modes and two possible operating points per mode, with each operating point $j\in\mathcal{J}_{i}$ defined by a scalar input $u_{ij}$ . We have $u_{11}=10$ , $u_{12}=46$ , $u_{21}=25$ , and $u_{22}=44$ . The values of the parameters describing the uncertainty set are shown in Table 4. Here, we assume that the plant has not been operated in any of the two modes prior to the beginning of the scheduling horizon.

Table 4: Parameters for the description of the uncertainty set in the production scheduling with active parameter estimation problem.

	$p^{\max}_{i}$	$a^{\min}_{i}$	$a^{\max}_{i}$	$b^{\min}_{i}$	$b^{\max}_{i}$
Mode 1	65.2	5	10	1	1.2
Mode 2	56.2	20	21	0.5	0.8

The optimal adjustable production schedule is shown in Figure 7. The solution suggests to first operate at point 1 of mode 2 and choose the subsequent schedule depending on the realization of $p_{121}$ . The schedule at the top is selected if $p_{121}\geq 36.75$ , and the bottom schedule is selected if $p_{121}<36.75$ . One can see that in both schedules, the operating points selected in time periods 2 and 3 are the same; the schedules only differ in the operating mode chosen for time period 4. By only observing the production amount at operating point 1 of mode 2, we cannot fully infer the model for mode 2, i.e. determine $a_{2}$ and $b_{2}$ . However, the observation reduces the uncertainty for mode 2, which allows us to determine in which mode the plant should operate in time period 4.

The effect of learning is further illustrated in Figure 8, which shows how the uncertainty set is reduced over time in a particular scenario with $p_{121}>36.75$ , in which case the schedule at the top of Figure 7 is applied. Each of the four subfigures shows the production amount as a function of the input for a specific time period and the mode chosen for that time period. The black solid vertical lines indicate the marginal uncertainty sets of the production amounts at the available operating points before any observations are made, as given by the ranges of possible values for $a_{i}$ and $b_{i}$ . In this particular scenario, a production amount of 39 is observed in time period 1 when the plant operates at point 1 of mode 2. Given this observation and $a_{2}\in[20,21]$ , the uncertainty with respect to $b_{2}$ is immediately reduced such that the line representing the linear model of mode 2 has to lie within the cyan shaded area shown in Figure 8a. One can see that this results in a substantial reduction of uncertainty with respect to the unobserved operating point 2 of mode 2. The same effect is seen in time period 2 when the plant operates at point 1 of mode 1 (Figure 8b). After operating at point 2 of mode 1 in time period 3, $a_{1}$ and $b_{1}$ can be uniquely determined given the observations $p_{211}$ and $p_{312}$ . The linear model of mode 1 is represented by the red solid line in Figure 8c. Finally, the linear model of mode 2 will also be exactly known after operating at point 2 of mode 2 in time period 4, as indicated in Figure 8d.

The optimal value of the problem is 819.2. Despite the simplicity of the illustrative example, the problem is computationally challenging. This is in part due to the large number of variables and constraints resulting from the reformulation, but even more due to the weak relaxation of the MILP. The latter is indicated by a lower bound that only improves very slowly in the branch-and-bound algorithm, despite the incumbent being the optimal solution already. Computational strategies for mitigating these challenges will be explored and discussed in future work.

8 Conclusions

In this work, we have developed an adjustable robust optimization framework that can simultaneously consider three types of endogenous uncertainty: type 1, type 2a, and type 2b, as defined in a new refined classification of endogenous uncertainty that distinguishes between the decision-dependent materialization and observation of uncertain parameters. One compelling insight from this extended view is that active learning, a sequential form of experimental design, can be formulated as an optimization problem under endogenous uncertainty. As a result, the proposed approach provides a means of performing integrated optimization and active learning in a sequential decision-making environment.

Our framework considers decision-dependent polyhedral uncertainty sets, and we have applied a decision rule approach based on the concept of lifted uncertainty that incorporates continuous and binary recourse, including recourse decisions that affect the uncertainty set. Multistage decision-dependent nonanticipativity is modeled using auxiliary uncertain parameters, which preserves the tractability of the decision rule approach. This eventually results in a mixed-integer optimization problem that can be directly solved using off-the-shelf solvers.

The proposed methodology significantly expands our capability to model and solve data-driven optimization problems under uncertainty, particularly those involving active learning. Computational challenges still exist, which we plan to address in our future work. Nonetheless, this approach promises to be especially suited for applications that involve complex constraints, expensive exploration actions, and strong links between different decision stages. To highlight the relevance of these kinds of problems, we have conducted computational case studies covering a variety of applications. The results demonstrate the versatility of the proposed modeling framework as well as the importance of endogenous uncertainty. As these aspects become increasingly relevant in PSE applications, we hope that this work can contribute toward the development of more comprehensive and efficient methods for such problems.

Acknowledgments

Wei Feng gratefully acknowledges the financial support from the China Scholarship Council (CSC) (No. 201906320317).

References

Ben-Tal and Nemirovski 1998 A. Ben-Tal and A. Nemirovski. Robust Convex Optimization. Mathematics of Operations Research, 23(4):769–805, 1998.
El Ghaoui et al. 1998 L. El Ghaoui, F. Oustry, and H. Lebret. Robust Solutions to Uncertain Semidefinite Programs. SIAM Journal on Optimization, 9(1):33–52, 1998.
Ben-Tal et al. 2004 A. Ben-Tal, A. Goryashko, E. Guslitzer, and A. Nemirovski. Adjustable robust solutions of uncertain linear programs. Mathematical Programming, 99(2):351–376, 2004.
Georghiou et al. 2019 A. Georghiou, D. Kuhn, and W. Wiesemann. The decision rule approach to optimization under uncertainty: methodology and applications. Computational Management Science, 16(4):545–576, 2019.
Bertsimas et al. 2013 D. Bertsimas, E. Litvinov, X. A. Sun, J. Zhao, and T. Zheng. Adaptive robust optimization for the security constrained unit commitment problem. IEEE Transactions on Power Systems, 28(1):52–63, 2013.
Zeng and Zhao 2013 B. Zeng and L. Zhao. Solving two-stage robust optimization problems using a column-and-constraint generation method. Operations Research Letters, 41(5):457–461, 2013.
Bertsimas and Caramanis 2010 D. Bertsimas and C. Caramanis. Finite adaptability in multistage linear optimization. IEEE Transactions on Automatic Control, 55(12):2751–2766, 2010.
Hanasusanto et al. 2015 G. A. Hanasusanto, D. Kuhn, and W. Wiesemann. K-Adaptability in Two-Stage Robust Binary Programming. Operations Research, 63(4):877–891, 2015.
Bertsimas and Georghiou 2015 D. Bertsimas and A. Georghiou. Design of Near Optimal Decision Rules in Multistage Adaptive Mixed-Integer Optimization. Operations Research, 63(3):610–627, 2015.
Postek and den Hertog 2016 K. Postek and D. den Hertog. Multistage Adjustable Robust Mixed-Integer Optimization via Iterative Splitting of the Uncertainty Set. INFORMS Journal on Computing, 28(3):553–574, 2016.
Bertsimas and Georghiou 2018 D. Bertsimas and A. Georghiou. Binary decision rules for multistage adaptive mixed-integer optimization. Mathematical Programming, 167(2):395–433, 2018.
Subramanyam et al. 2020 A. Subramanyam, C. E. Gounaris, and W. Wiesemann. K-adaptability in two-stage mixed-integer robust optimization. Mathematical Programming Computation, 12:193–224, 2020.
Yanıkoğlu et al. 2019 I. Yanıkoğlu, B. L. Gorissen, and D. den Hertog. A survey of adjustable robust optimization. European Journal of Operational Research, 277(3):799–813, 2019.
Lappas and Gounaris 2016 N. H. Lappas and C. E. Gounaris. Multi-Stage Adjustable Robust Optimization for Process Scheduling Under Uncertainty. AIChE Journal, 62(5):1646–1667, 2016.
Zhang et al. 2016a Q. Zhang, M. F. Morari, I. E. Grossmann, A. Sundaramoorthy, and J. M. Pinto. An adjustable robust optimization approach to scheduling of continuous industrial processes providing interruptible load. Computers and Chemical Engineering, 86:106–119, 2016a.
Shi and You 2016 H. Shi and F. You. A Computational Framework and Solution Algorithms for Two-Stage Adaptive Robust Scheduling of Batch Manufacturing Processes Under Uncertainty. AIChE Journal, 62(3):687–703, 2016.
Zhang et al. 2016b Q. Zhang, I. E. Grossmann, and R. M. Lima. On the Relation Between Flexibility Analysis and Robust Optimization for Linear Systems. AIChE Journal, 62(9):3109–2132, 2016b.
Grossmann et al. 2014 I. E. Grossmann, B. A. Calfa, and P. Garcia-Herreros. Evolution of concepts and models for quantifying resiliency and flexibility of chemical processes. Computers and Chemical Engineering, 70:22–34, 2014.
Yuan et al. 2018 Y. Yuan, Z. Li, and B. Huang. Nonlinear robust optimization for process design. AIChE Journal, 64(2):481–494, 2018.
Gong and You 2018 J. Gong and F. You. Resilient design and operations of process systems: Nonlinear adaptive robust optimization model and algorithm for resilience analysis and enhancement. Computers and Chemical Engineering, 116:231–252, 2018.
Ning and You 2017 C. Ning and F. You. A Data-Driven Multistage Adaptive Robust Optimization Framework for Planning and Scheduling Under Uncertainty. AIChE Journal, 63(10):4343–4369, 2017.
Lappas and Gounaris 2018a N. H. Lappas and C. E. Gounaris. Theoretical and computational comparison of continuous-time process scheduling models for adjustable robust optimization. AIChE Journal, 64(8):3055–3070, 2018a.
Feng et al. 2019 W. Feng, Y. Zhang, G. Rong, and Y. Feng. Finite Adaptability in Data-Driven Robust Optimization for Production Scheduling: A Case Study of the Ethylene Plant. Industrial & Engineering Chemistry Research, 58(16):6505–6518, 2019.
Rahal et al. 2020 S. Rahal, Z. Li, and D. J. Papageorgiou. Proactive and reactive scheduling of the steelmaking and continuous casting process through adaptive robust optimization. Computers and Chemical Engineering, 133:106658, 2020.
Zhang et al. 2017 X. Zhang, M. Kamgarpour, A. Georghiou, P. Goulart, and J. Lygeros. Robust optimal control with adjustable uncertainty sets. Automatica, 75:249–259, 2017.
Shang et al. 2019 C. Shang, W. H. Chen, and F. You. Robust constrained model predictive control of irrigation systems based on data-driven uncertainty set constructions. In Proceedings of the American Control Conference, pages 1906–1911. American Automatic Control Council, 2019.
Ning and You 2020 C. Ning and F. You. A transformation-proximal bundle algorithm for multistage adaptive robust optimization and application to constrained robust optimal control. Automatica, 113:108802, 2020.
Yue and You 2016 D. Yue and F. You. Optimal Supply Chain Design and Operations Under Multi-Scale Uncertainties: Nested Stochastic Robust Optimization Modeling Framework and Solution Algorithm. AIChE Journal, 62(9):3041–3055, 2016.
Subramanyam et al. 2018 A. Subramanyam, F. Mufalli, J. M. Pinto, and C. E. Gounaris. Robust Multi-Period Vehicle Routing under Customer Order Uncertainty. Optimization Online, 2018.
Jonsbråten et al. 1998 T. W. Jonsbråten, R. J.-B. Wets, and D. L. Woodruff. A class of stochastic programs with decision dependent random elements. Annals of Operations Research, 82:83–106, 1998.
Peeta et al. 2010 S. Peeta, F. S. Salman, D. Gunnec, and K. Viswanath. Pre-disaster investment decisions for strengthening a highway network. Computers and Operations Research, 37(10):1708–1719, 2010.
Escudero et al. 2018 L. F. Escudero, M. A. Garín, J. F. Monge, and A. Unzueta. On preparedness resource allocation planning for natural disaster relief under endogenous uncertainty with time-consistent risk-averse management. Computers and Operations Research, 98:84–102, 2018.
Hellemo et al. 2018 L. Hellemo, P. I. Barton, and A. Tomasgard. Decision-dependent probabilities in stochastic programs with recourse. Computational Management Science, 15(3-4):369–395, 2018.
Goel and Grossmann 2004 V. Goel and I. E. Grossmann. A stochastic programming approach to planning of offshore gas field developments under uncertainty in reserves. Computers and Chemical Engineering, 28(8):1409–1429, 2004.
Goel and Grossmann 2006 V. Goel and I. E. Grossmann. A class of stochastic programs with decision dependent uncertainty. Mathematical Programming, 108:355–397, 2006.
Colvin and Maravelias 2008 M. Colvin and C. T. Maravelias. A stochastic programming approach for clinical trial planning in new drug development. Computers and Chemical Engineering, 32(11):2626–2642, 2008.
Colvin and Maravelias 2010 M. Colvin and C. T. Maravelias. Modeling methods and a branch and cut algorithm for pharmaceutical clinical trial planning using stochastic programming. European Journal of Operational Research, 203(1):205–215, 2010.
Vayanos et al. 2011 P. Vayanos, D. Kuhn, and B. Rustem. Decision rules for information discovery in multi-stage stochastic programming. In Proceedings of the IEEE Conference on Decision and Control, pages 7368–7373, 2011.
Tarhan et al. 2013 B. Tarhan, I. E. Grossmann, and V. Goel. Computational strategies for non-convex multistage MINLP models with decision-dependent uncertainty and gradual uncertainty resolution. Annals of Operations Research, 203(1):141–166, 2013.
Gupta and Grossmann 2014 V. Gupta and I. E. Grossmann. A new decomposition algorithm for multistage stochastic programs with endogenous uncertainties. Computers and Chemical Engineering, 62:62–79, 2014.
Christian and Cremaschi 2015 B. Christian and S. Cremaschi. Heuristic solution approaches to the pharmaceutical R&D pipeline management problem. Computers and Chemical Engineering, 74:34–47, 2015.
Hooshmand and MirHassani 2016 F. Hooshmand and S. A. MirHassani. Efficient constraint reduction in multistage stochastic programming problems with endogenous uncertainty. Optimization Methods and Software, 31(2):359–376, 2016.
Apap and Grossmann 2017 R. M. Apap and I. E. Grossmann. Models and computational strategies for multistage stochastic programming under endogenous and exogenous uncertainties. Computers and Chemical Engineering, 103:233–274, 2017.
Poss 2013 M. Poss. Robust combinatorial optimization with variable budgeted uncertainty. 4OR, 11(1):75–92, 2013.
Nohadani and Sharma 2018 O. Nohadani and K. Sharma. Optimization under decision-dependent uncertainty. SIAM Journal on Optimization, 28(2):1773–1795, 2018.
Lappas and Gounaris 2018b N. H. Lappas and C. E. Gounaris. Robust optimization for decision-making under endogenous uncertainty. Computers and Chemical Engineering, 111:252–266, 2018b.
Avraamidou and Pistikopoulos 2019 S. Avraamidou and E. N. Pistikopoulos. Adjustable robust optimization through multi-parametric programming. Optimization Letters, 14(4):873–887, 2019.
Feng et al. 2020 W. Feng, Y. Feng, and Q. Zhang. Multistage Robust Mixed-Integer Optimization Under Endogenous Uncertainty. arXiv preprint arXiv:2008.11633, 2020.
Bertsimas and Vayanos 2014 D. Bertsimas and P. Vayanos. Data-driven learning in dynamic pricing using adaptive optimization. Optimization Online, 2014.
Vayanos et al. 2020a P. Vayanos, A. Georghiou, and H. Yu. Robust optimization with decision-dependent information discovery. arXiv preprint arXiv:2004.08490, 2020a.
Vayanos et al. 2020b P. Vayanos, D. C. Mcelfresh, Y. Ye, J. P. Dickerson, and E. Rice. Active Preference Elicitation via Adjustable Robust Optimization. arXiv preprint arXiv:2003.01899, 2020b.
Shahriari et al. 2016 B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. De Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2016.
Sutton and Barto 2018 R. S. Sutton and A. G. Barto. Reinforcement learning: An introduction. MIT press, 2018.
Audibert et al. 2010 J. Y. Audibert, S. Bubeck, and R. Munos. Best arm identification in multi-armed bandits. COLT 2010 - The 23rd Conference on Learning Theory, pages 41–53, 2010.
Georghiou et al. 2015 A. Georghiou, W. Wiesemann, and D. Kuhn. Generalized decision rule approximations for stochastic programming via liftings. Mathematical Programming, 152(1-2):301–338, 2015.
Gorissen et al. 2015 B. L. Gorissen, I. Yanikoğlu, and D. den Hertog. A practical guide to robust optimization. Omega, 53:124–137, 2015.
Lubin and Dunning 2015 M. Lubin and I. Dunning. Computing in Operations Research Using Julia. INFORMS Journal on Computing, 27(2):237–248, 2015.
Basciftci et al. 2019 B. Basciftci, S. Ahmed, and N. Gebraeel. Adaptive Two-stage Stochastic Programming with an Application to Capacity Expansion Planning. arXiv preprint arXiv:1906.03513, 2019.
Min et al. 2018 D. Min, J.-h. Ryu, and D. G. Choi. A long-term capacity expansion planning model for an electric power system integrating large-size renewable energy technologies. Computers and Operations Research, 96:244–255, 2018.