
Simple method for efficiently solving dynamic models with continuous actions using policy gradient

Takeshi Fukasawa Graduate School of Public Policy, The University of Tokyo. Email: [email protected]
The replication code of the numerical experiments in this article is available at https://github.com/takeshi-fukasawa/dynamic_opt_conti_action.
Abstract

This study proposes the Value Function-Policy Gradient Iteration-Spectral (VF-PGI-Spectral) algorithm, which efficiently solves discrete-time infinite-horizon dynamic models with continuous actions. It incorporates the spectral algorithm to accelerate convergence. The method is applicable not only to single-agent dynamic optimization problems, but also to multi-agent dynamic games, which previously proposed methods cannot handle. Moreover, the proposed algorithm is not limited to models with specific functional forms, and it applies to models with multiple continuous actions. Numerical experiments demonstrate the effective performance of the proposed algorithm.

Keywords: Dynamic optimization with continuous actions; Single-agent dynamic models; Multi-agent dynamic games; VF-PGI-Spectral algorithm

1 Introduction

Dynamic optimization problems involving continuous actions are widespread in economics and other fields. Examples include consumption, investment, R&D, advertising, and dynamic pricing decisions, which are extensively studied in the literature of macroeconomics, industrial organization, labor economics, and others. Although such problems are common and important, one challenge to conducting quantitative analyses is the computational burden. When we apply standard value function iteration (VFI) to solve a discrete-time infinite-horizon dynamic model, the computational cost is generally determined by the number of iterations until convergence and the computational cost per iteration. When applying VFI to solve a dynamic model, we must solve for the optimal continuous actions $a$ at each state given the candidate values of the value function $V$. This optimization problem is typically nonlinear, and the computational step can be time-consuming.[1]

[1] Discretizing the possible actions may help avoid the nonlinear optimization problem, but a coarse discretization of actions can reduce the precision of the approximation. Conversely, when the number of possible actions is large, evaluating long-term profits for all possible actions can be time-consuming. Hence, this approach has limitations. See also the discussion in Section 12 of Judd (1998).

This study introduces a new, simple, computationally efficient “Value Function-Policy Gradient Iteration-Spectral (VF-PGI-Spectral) algorithm” for quantitatively solving discrete-time infinite-horizon dynamic models with continuous actions. Though the proposed algorithm inherits the idea of VFI, we jointly update value functions $V$ and action variables $a$ (policy functions) in each iteration. Value functions $V$ are updated as in value function iteration, using the candidate values of the actions $a$. Continuous actions $a$ are updated so that they step in the direction of the gradient of the action value function $Q$, given the candidate value functions $V$. Action value functions are the dynamic version of “profit.” The computational cost per iteration is low because these updating steps do not require nonlinear root-finding or optimization procedures. Note that we can easily introduce box constraints on the domain of actions, such as nonnegativity constraints.

This study combines the proposed iteration with the spectral algorithm, which is known, mainly in the field of numerical analysis, for its effectiveness in reducing the number of iterations needed to solve nonlinear equations. The spectral algorithm shares similarities with Newton’s method, which is well known for its rapid local convergence. Unlike Newton’s method, the spectral algorithm does not require specifying the first derivative of the function to be solved, which is its advantage. This study shows the results of numerical experiments, in which the VF-PGI-Spectral algorithm is dozens of times faster than the traditional VFI algorithm and previously proposed methods.

The proposed VF-PGI-Spectral algorithm offers the following advantages. First, the algorithm is intuitive and easy to code. Second, the algorithm is not limited to models with specific functional forms, and it applies to both single-agent dynamic models and multi-agent dynamic games; it is also applicable to dynamic models with multiple continuous actions. Third, it attains fast convergence because, unlike the traditional VFI, it does not require solving nonlinear optimization problems within the iterations.

The remainder of this paper is organized as follows. Section 3 discusses the proposed algorithm. In Section 4, we examine how to incorporate box constraints on continuous actions, such as nonnegativity constraints, into the proposed algorithm. Section 5 shows the results of numerical experiments. Sections 2 and 6 discuss the relationships with previous studies and previously proposed methods. Section 7 concludes. Appendix A contains all proofs of mathematical statements. Appendix B describes the details of the numerical experiments, and Appendix C shows additional numerical results.

2 Literature

This study contributes to the literature on computationally efficient algorithms for solving dynamic optimization problems with continuous actions. Carroll (2006) proposed the Endogenous Gridpoint Method (EGM), which avoids time-consuming nonlinear optimization steps, for solving single-agent dynamic optimization problems with a continuous action. Maliar and Maliar (2013) and Arellano et al. (2016) proposed the Envelope Condition Method (ECM) for single-agent dynamic models with continuous actions.

Though useful, the applicability of these methods largely depends on the functional form of the model. Moreover, as discussed in Section 6, they are not directly applicable to multi-agent dynamic games because of strategic interactions. Furthermore, when extending these methods to single-agent dynamic models with multiple continuous actions, we additionally need to solve time-consuming nonlinear equations in some models.[2] Unlike these methods, the proposed VF-PGI-Spectral algorithm is not restricted to models with specific functional forms, and it applies to both single-agent and multi-agent dynamic models with single and multiple continuous actions.

[2] Iskhakov (2015) discusses the class of models with multiple continuous actions where EGM can be applied without additional nonlinear problems.

This study also builds on the existing numerical-analysis literature on the spectral algorithm. The spectral algorithm was first proposed to solve nonlinear continuous optimization problems (Barzilai and Borwein, 1988) and was later extended to solve nonlinear equations (La Cruz et al., 2006; Varadhan and Roland, 2008). When dealing with dynamic optimization problems with continuous actions, we essentially need to solve two types of problems: nonlinear equations related to value functions, and nonlinear optimization problems related to continuous actions. The current study builds on this literature and proposes a new algorithm for efficiently solving dynamic optimization models with continuous actions.

The idea of introducing the spectral algorithm to efficiently solve nonlinear problems is not new in the economics literature. In studies on industrial organization, Reynaerts et al. (2012) and Conlon and Gortmaker (2020) examined the use of spectral-type algorithms to reduce the number of iterations in the inner loop of estimating static random-coefficient logit models with aggregate data (the BLP method proposed by Berry et al., 1995).[3] Aguirregabiria and Marcoux (2021) introduced the spectral algorithm in the context of nested pseudo-likelihood estimation of dynamic discrete games. Fukasawa (2024), who proposed novel, fast, and simple algorithms for efficiently solving the inner loop of static/dynamic BLP estimations, also introduced spectral-type algorithms and emphasized the cautious selection of step sizes. The current study builds on these findings and discusses computationally efficient algorithms for solving dynamic models with continuous actions.

[3] These studies also introduced the SQUAREM algorithm, which can be considered a minor extension of the spectral algorithm. See Fukasawa (2024) for details.

The concept of using gradients as an updating direction[4] can be found in the reinforcement learning literature (deterministic policy gradient method; Silver et al., 2014). However, the settings may differ, and we discuss these differences in Section 6.

[4] In standard continuous optimization problems, gradient descent-type methods also use gradients as an updating direction.

3 Model and Algorithms

In this section, we first examine solution methods for single-agent dynamic optimization problems (Section 3.1), and subsequently examine multi-agent dynamic games (Section 3.2). Although single-agent dynamic models are special cases of multi-agent dynamic games, we first discuss the former for clarity. Readers who are interested only in single-agent models can skip Section 3.2. Section 3.3 discusses the spectral algorithm, which effectively reduces the number of iterations until convergence. In the following sections, primes on variables denote next-period values.

3.1 Single-agent dynamic model

Consider a single-agent infinite-horizon dynamic optimization problem with continuous action variables $a(s)$ characterized by the following Bellman equation:

$V(s)=\max_{a(s)\in\mathcal{A}(s)}Q(a(s),s;V)\equiv r(a^{*}(s),s)+\beta\int V\left(s^{\prime}\right)p\left(s^{\prime}|s,a^{*}(s)\right)ds^{\prime},$

where $a(s)\equiv\left(a^{d}(s)\right)_{d=1,\cdots,D}$ denotes the action of the agent at state $s$. Note that the action can be multi-dimensional, and let $D$ be the dimension of the actions. Let $a^{*}(s)$ be the solution to the maximization problem. $\mathcal{S}$ denotes the state space, which can be either discrete or continuous. $\mathcal{A}(s)$ denotes the convex closed set of possible actions at state $s$. For now, we assume $\mathcal{A}(s)=\mathbb{R}^{D}$, but we can easily introduce box constraints, as discussed in Section 4. $p\left(s^{\prime}|s,a(s)\right)$ denotes the state transition probability. $Q(a(s),s;V)$ denotes the action value function, and $V(s)$ denotes the value function. Note that $Q$ depends on the value function $V$, and $Q\left(a^{*}(s),s;V\right)=V(s)$ holds.

Assuming that $Q$ is differentiable with respect to $a$, the following equations hold:

$0=\frac{\partial Q(a^{*}(s),s;V)}{\partial a^{*}(s)}=\frac{\partial r\left(a^{*}(s),s\right)}{\partial a^{*}(s)}+\beta\frac{\partial\left[\int V\left(s^{\prime}\right)p\left(s^{\prime}|s,a^{*}(s)\right)ds^{\prime}\right]}{\partial a^{*}(s)},\qquad(1)$

$V(s)=r(a^{*}(s),s)+\beta\int V\left(s^{\prime}\right)p\left(s^{\prime}|s,a^{*}(s)\right)ds^{\prime}.\qquad(2)$

Then, the problem reduces to solving the nonlinear equations (1) and (2) for $\left\{a^{*}(s)\right\}_{s\in\mathcal{S}}$ and $\left\{V(s)\right\}_{s\in\mathcal{S}}$. In general, the state space of $s$ is not finite. In this case, we should take a suitable finite set of grid points $\widehat{\mathcal{S}}\subset\mathcal{S}$ to approximate the solution well. If the state space $\mathcal{S}$ is finite, let $\widehat{\mathcal{S}}=\mathcal{S}$. The alternative equations that we should solve are as follows:

$0=\frac{\partial Q(a^{*}(\widehat{s}),\widehat{s};V)}{\partial a^{*}(\widehat{s})}=\frac{\partial r\left(a^{*}(\widehat{s}),\widehat{s}\right)}{\partial a^{*}(\widehat{s})}+\beta\frac{\partial\left[\int\overline{V}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a^{*}(\widehat{s})\right)ds^{\prime}\right]}{\partial a^{*}(\widehat{s})},\qquad(3)$

$V\left(\widehat{s}\right)=r\left(a^{*}(\widehat{s}),\widehat{s}\right)+\beta\int\overline{V}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a^{*}(\widehat{s})\right)ds^{\prime}.\qquad(4)$

Here, $\overline{V}$ denotes the interpolated values of $V$.[5] If $\widehat{\mathcal{S}}\subsetneq\mathcal{S}$, we cannot directly evaluate the values of $V(s)$ for $s\notin\widehat{\mathcal{S}}$, and we interpolate these values based on the values of $\left\{V(\widehat{s})\right\}_{\widehat{s}\in\widehat{\mathcal{S}}}$. We solve these equations for $\left\{a^{*}(\widehat{s})\right\}_{\widehat{s}\in\widehat{\mathcal{S}}}$ and $\left\{V(\widehat{s})\right\}_{\widehat{s}\in\widehat{\mathcal{S}}}$.

[5] For example, when using Chebyshev polynomials as basis functions, $\overline{V}$ can be constructed as $\overline{V}(s)=\sum_{m=0}^{M-1}\theta_{m}T_{m}(s)$, where $T_{m}(s)$ denotes the order-$m$ Chebyshev basis function. $\theta_{m=0,1,\cdots,M-1}$ are unknown coefficients, and their values are computed based on the values of $\left\{V(\widehat{s})\right\}_{\widehat{s}\in\widehat{\mathcal{S}}}$. Note that $T_{m}$ satisfies $T_{0}(s)=1$, $T_{1}(s)=s$, and $T_{m}(s)=2sT_{m-1}(s)-T_{m-2}(s)\ (m\geq 2)$.
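To make the interpolation step concrete, the following is a minimal Python sketch of the Chebyshev construction in footnote [5], assuming a one-dimensional state already rescaled to $[-1,1]$ and coefficients fitted by least squares; the function names are illustrative and are not taken from the replication code.

```python
import numpy as np

def chebyshev_basis(s, M):
    """Evaluate T_0(s), ..., T_{M-1}(s) via T_m(s) = 2 s T_{m-1}(s) - T_{m-2}(s)."""
    s = np.atleast_1d(s)
    T = np.empty((s.size, M))
    T[:, 0] = 1.0
    if M > 1:
        T[:, 1] = s
    for m in range(2, M):
        T[:, m] = 2.0 * s * T[:, m - 1] - T[:, m - 2]
    return T

def fit_value_coefficients(s_grid, V_grid, M):
    """Fit theta in Vbar(s) = sum_m theta_m T_m(s) to grid values by least squares."""
    theta, *_ = np.linalg.lstsq(chebyshev_basis(s_grid, M), V_grid, rcond=None)
    return theta

def vbar(s, theta):
    """Interpolated value function at (possibly off-grid) states s."""
    return chebyshev_basis(s, len(theta)) @ theta

# Usage: fit V(s) = s**2 on 7 Chebyshev nodes in [-1, 1], then evaluate off-grid.
nodes = np.cos((2 * np.arange(7) + 1) * np.pi / 14)
theta = fit_value_coefficients(nodes, nodes**2, M=5)
print(vbar(0.3, theta))   # approximately [0.09]
```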

Algorithm 1 shows the proposed Value Function-Policy Gradient Iteration (VF-PGI) algorithm.

1. Take grid points $\widehat{s}\in\widehat{\mathcal{S}}$. Set initial values $V^{(0)}\equiv\left\{V^{(0)}\left(\widehat{s}\right)\right\}_{\widehat{s}\in\widehat{\mathcal{S}}}$ and $a^{(0)}\equiv\left\{a^{(0)}\left(\widehat{s}\right)\right\}_{\widehat{s}\in\widehat{\mathcal{S}}}$.

2. Iterate the following ($n=0,1,\cdots$):

   (a) Find $a^{(n*)}\left(\widehat{s}\right)$ by:

   $a^{(n*)}\left(\widehat{s}\right)=\Phi_{a}\left(V^{(n)},a^{(n)}\right)\left(\widehat{s}\right)\equiv a^{(n)}\left(\widehat{s}\right)+\lambda\cdot\frac{\partial Q\left(a^{(n)}\left(\widehat{s}\right),\widehat{s};V^{(n)}\right)}{\partial a^{(n)}\left(\widehat{s}\right)},$

   where

   $\frac{\partial Q\left(a^{(n)}\left(\widehat{s}\right),\widehat{s};V^{(n)}\right)}{\partial a^{(n)}\left(\widehat{s}\right)}=\frac{\partial r\left(a^{(n)}\left(\widehat{s}\right),\widehat{s}\right)}{\partial a^{(n)}\left(\widehat{s}\right)}+\beta\frac{\partial\left[\int\overline{V}^{(n)}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a^{(n)}\left(\widehat{s}\right)\right)ds^{\prime}\right]}{\partial a^{(n)}\left(\widehat{s}\right)}.$

   (b) Find

   $V^{(n*)}\left(\widehat{s}\right)=\Phi_{V}\left(V^{(n)},a^{(n)}\right)\left(\widehat{s}\right)\equiv r\left(a^{(n)}\left(\widehat{s}\right),\widehat{s}\right)+\beta\int\overline{V}^{(n)}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a^{(n)}\left(\widehat{s}\right)\right)ds^{\prime}$

   for $\widehat{s}\in\widehat{\mathcal{S}}$.

   (c) Update $V$ and $a$ in either of the following ways:

   • Simple: $V^{(n+1)}\left(\widehat{s}\right)\leftarrow V^{(n*)}\left(\widehat{s}\right)$, $a^{(n+1)}\left(\widehat{s}\right)\leftarrow a^{(n*)}\left(\widehat{s}\right)$

   • Spectral algorithm: For $z\in\left\{V,a^{d=1,\cdots,D}\right\}$, $z^{(n+1)}\left(\widehat{s}\right)\leftarrow\alpha_{z}^{(n)}z^{(n*)}\left(\widehat{s}\right)+(1-\alpha_{z}^{(n)})z^{(n)}\left(\widehat{s}\right)$, where $\alpha_{z}^{(n\geq 1)}\equiv\frac{\left\|z^{(n)}-z^{(n-1)}\right\|_{2}}{\left\|F_{z}\left(V^{(n)},a^{(n)}\right)-F_{z}\left(V^{(n-1)},a^{(n-1)}\right)\right\|_{2}}$ and $F_{z}(V,a)\equiv\Phi_{z}\left(V,a\right)-z$.

   (d) Exit the iteration if $\left\|V^{(n+1)}-V^{(n)}\right\|<\epsilon_{V}$ and $\left\|a^{d(n+1)}-a^{d(n)}\right\|<\epsilon_{a^{d}}\ (d=1,\cdots,D)$.

Algorithm 1 VF-PGI algorithm (Single-agent dynamic model)

Regarding the VF-PGI algorithm (Algorithm 1), we can explain the steps intuitively and treat the iterations as a learning process. We first set initial values of the value function $V$ and the actions $a$. If the gradient of the “long-run profit” $Q$ with respect to the action in a state, given the current value functions and actions, is positive, we increase the value of the action, expecting a higher long-term profit. If it is negative, we reduce the value of the action. When the gradient is close to zero, the action is likely to be optimal given the value functions, and we rarely change the value of the action. As for value functions, we update their values so that they are consistent with the Bellman equations given the current value functions and actions. Note that the actions used in this step might not be optimal. We repeat these steps until the current and updated values are sufficiently close to each other, and then terminate the process.

In the algorithm, we can incorporate the spectral algorithm, which has been applied to accelerate iterations in solving continuous optimization problems and nonlinear equations. When introducing the spectral algorithm, the values of $V,a$ are updated by $z^{(n+1)}\left(\widehat{s}\right)\leftarrow\alpha_{z}^{(n)}z^{(n*)}\left(\widehat{s}\right)+(1-\alpha_{z}^{(n)})z^{(n)}\left(\widehat{s}\right)$ for $z\in\left\{V,a^{d=1,\cdots,D}\right\}$, where $\alpha_{z}^{(n)}\equiv\frac{\left\|z^{(n)}-z^{(n-1)}\right\|_{2}}{\left\|F_{z}\left(V^{(n)},a^{(n)}\right)-F_{z}\left(V^{(n-1)},a^{(n-1)}\right)\right\|_{2}}$. Here, $\alpha_{z}^{(n)}$ denote step sizes. The step sizes work as learning rates, and the spectral algorithm accelerates the learning by automatically choosing their values according to a predetermined formula. For details of the spectral algorithm, see Section 3.3. Note that without the spectral algorithm, the VF-PGI algorithm works poorly due to slow convergence, so introducing the spectral algorithm is essential.
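As an illustration, here is a minimal Python sketch of the VF-PGI-Spectral iteration of Algorithm 1 (the paper's replication code itself is in MATLAB). The model-specific pieces are assumed to be supplied by the user: `bellman_rhs(V, a)` returns the right-hand side of the Bellman equation at every grid point, and `dQ_da(V, a)` returns $\partial Q/\partial a$; these names, the default values, and the residual-based stopping rule are illustrative.

```python
import numpy as np

def vf_pgi_spectral(V0, a0, bellman_rhs, dQ_da, lam=1e-3, alpha0=1.0,
                    tol=1e-10, max_iter=10_000):
    """Jointly update V (toward the Bellman right-hand side) and a (a gradient
    step on Q), with variable-type-specific spectral step sizes as in step 2(c)."""
    V, a = V0.copy(), a0.copy()
    alpha = {"V": alpha0, "a": alpha0}
    V_old = a_old = F_old = None
    for n in range(max_iter):
        # Residuals F_z(V, a) = Phi_z(V, a) - z
        F = {"V": bellman_rhs(V, a) - V,      # Phi_V(V, a) - V
             "a": lam * dQ_da(V, a)}          # Phi_a(V, a) - a = lambda * dQ/da
        if max(np.max(np.abs(F["V"])), np.max(np.abs(F["a"]))) < tol:
            return V, a, n
        if n >= 1:
            # Spectral step sizes alpha_z = ||z_n - z_{n-1}|| / ||F_z_n - F_z_{n-1}||
            alpha["V"] = np.linalg.norm(V - V_old) / np.linalg.norm(F["V"] - F_old["V"])
            alpha["a"] = np.linalg.norm(a - a_old) / np.linalg.norm(F["a"] - F_old["a"])
        V_old, a_old, F_old = V.copy(), a.copy(), F
        V = V + alpha["V"] * F["V"]           # alpha_V * Phi_V + (1 - alpha_V) * V
        a = a + alpha["a"] * F["a"]
    return V, a, max_iter
```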

To compare with the traditional Value Function Iteration (VFI) method, I also show the VFI algorithm in Algorithm 2. In VFI, we need to solve a nonlinear equation for the optimal actions at each grid point, as shown in Step 2(a). Because solving nonlinear equations usually takes nonnegligible time, this step is computationally costly. In contrast, in VF-PGI, we do not have to solve nonlinear equations to update the values of actions in Step 2(a). Though we have to update value functions and actions jointly, we can expect the computational cost per iteration to be lower.

1. Take grid points $\widehat{s}\in\widehat{\mathcal{S}}$. Set initial values $V^{(0)}\equiv\left\{V^{(0)}\left(\widehat{s}\right)\right\}_{\widehat{s}\in\widehat{\mathcal{S}}}$.

2. Iterate the following ($n=0,1,\cdots$):

   (a) Solve for $a^{(n*)}\left(\widehat{s}\right)=\Phi_{a}\left(V^{(n)},a^{(n)}\right)\left(\widehat{s}\right)$ satisfying the following nonlinear equation for each $\widehat{s}\in\widehat{\mathcal{S}}$:

   $0=\frac{\partial Q(a^{(n*)}\left(\widehat{s}\right),\widehat{s};V^{(n)})}{\partial a^{(n*)}\left(\widehat{s}\right)}=\frac{\partial r\left(a^{(n*)}\left(\widehat{s}\right),\widehat{s}\right)}{\partial a^{(n*)}\left(\widehat{s}\right)}+\beta\frac{\partial\left[\int\overline{V}^{(n)}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a^{(n*)}\left(\widehat{s}\right)\right)ds^{\prime}\right]}{\partial a^{(n*)}\left(\widehat{s}\right)}.$

   (b) Find

   $V^{(n*)}\left(\widehat{s}\right)=r\left(a^{(n*)}\left(\widehat{s}\right),\widehat{s}\right)+\beta\int\overline{V}^{(n)}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a^{(n*)}\left(\widehat{s}\right)\right)ds^{\prime}$

   for $\widehat{s}\in\widehat{\mathcal{S}}$.

   (c) Update $V$ in either of the following ways:

   • Simple: $V^{(n+1)}\left(\widehat{s}\right)\leftarrow V^{(n*)}\left(\widehat{s}\right)$

   • Spectral algorithm: $V^{(n+1)}\left(\widehat{s}\right)\leftarrow\alpha_{V}^{(n)}V^{(n*)}\left(\widehat{s}\right)+(1-\alpha_{V}^{(n)})V^{(n)}\left(\widehat{s}\right)$, where $\alpha_{V}^{(n\geq 1)}\equiv\frac{\left\|V^{(n)}-V^{(n-1)}\right\|_{2}}{\left\|F_{V}\left(V^{(n)},a^{(n)}\right)-F_{V}\left(V^{(n-1)},a^{(n-1)}\right)\right\|_{2}}$ and $F_{V}(V,a)\equiv\Phi_{V}\left(V,a\right)-V$.

   (d) Exit the iteration if $\left\|V^{(n+1)}-V^{(n)}\right\|<\epsilon_{V}$.

Algorithm 2 VFI algorithm (Single-agent dynamic model)
Remark 1.

In VF-PGI, the parameter $\lambda$ corresponds to the learning rate, and the iteration can be unstable when $\lambda$ is large. In contrast, a small $\lambda$ can lead to slow convergence. Note that when we combine the spectral algorithm, the convergence speed is not very sensitive to the value of $\lambda$ as long as $\lambda$ is relatively small. See also the numerical results in Section 5.1.

Remark 2.

Unlike VFI, which is a contraction under regularity conditions, there is no guarantee that VF-PGI is a contraction. However, in the numerical experiments, VF-PGI-Spectral converges for appropriate values of $\lambda$.

3.2 Multi-agent dynamic game

Next, we consider a dynamic model in which multiple agents interact over time. Let $\pi_{j}\left(a_{t+\tau},s_{t+\tau}\right)$ be agent $j$'s profit when agents choose actions $a_{t+\tau}\equiv\left(a_{jt+\tau}\right)_{j\in\mathcal{J}}$ at state $s_{t+\tau}$. Here, $\mathcal{J}$ denotes the set of agents. Given a current state $s_{t}$, agent $j$'s expected future profit is $E\left[\sum_{\tau=0}^{\infty}\beta^{\tau}\pi_{j}\left(a_{t+\tau},s_{t+\tau}\right)\,\middle|\,s_{t}\right]$. Actions can be multidimensional, and let $a_{j}\equiv\left(a_{j}^{d}\right)_{d=1,\cdots,D}$.

Here, we focus on pure-strategy Markov perfect equilibria. We assume the existence of such an equilibrium, and let $a_{j}^{*}:\mathcal{S}\rightarrow\mathcal{A}_{j}$ be the Markov strategy of agent $j$. $a^{*}\equiv\left(a_{j}^{*}\right)_{j\in\mathcal{J}}$ denotes the profile of these strategies, where $a^{*}:\mathcal{S}\rightarrow\mathcal{A}$.

Then, the value function of firm $j$ given a state $s$ can be written recursively:

$V_{j}(s)=\max_{a_{j}(s)\in\mathcal{A}_{j}(s)}Q_{j}\left(a_{j}(s),a_{-j}^{*}(s),s;V_{j}\right)\equiv r_{j}(a_{j}^{*}(s),a_{-j}^{*}(s),s)+\beta\int V_{j}\left(s^{\prime}\right)p(s^{\prime}|s,a_{j}^{*}(s),a_{-j}^{*}(s))ds^{\prime},$

where $a_{j}(s)$ is the action of agent $j$ at state $s$. $Q_{j}\left(a_{j}(s),a_{-j}(s),s;V_{j}\right)$ is the action value function of firm $j$, and $V_{j}(s)$ is the value function. $p\left(s^{\prime}|s,a(s)\right)$ is the state transition probability. Note that $Q_{j}\left(a_{j}^{*}(s),a_{-j}^{*}(s),s;V_{j}\right)=V_{j}(s)$ holds.

If the action $a_{j}(s)$ is a continuous variable and $Q_{j}$ is differentiable with respect to $a_{j}(s)$, the following equations hold:

$0=\frac{\partial Q_{j}(a_{j}^{*}(s),a_{-j}^{*}(s),s;V_{j})}{\partial a_{j}^{*}(s)}=\frac{\partial r_{j}\left(a_{j}^{*}(s),a_{-j}^{*}(s),s\right)}{\partial a_{j}^{*}(s)}+\beta\frac{\partial\left[\int V_{j}\left(s^{\prime}\right)p(s^{\prime}|s,a_{j}^{*}(s),a_{-j}^{*}(s))ds^{\prime}\right]}{\partial a_{j}^{*}(s)},\qquad(5)$

$V_{j}(s)=r_{j}(a^{*}(s),s)+\beta\int V_{j}\left(s^{\prime}\right)p\left(s^{\prime}|s,a^{*}(s)\right)ds^{\prime}.\qquad(6)$

Then, the problem reduces to solving the nonlinear equations (5) and (6) for $\left\{a_{j}^{*}(s)\right\}_{j\in\mathcal{J},s\in\mathcal{S}}$ and $\left\{V_{j}(s)\right\}_{j\in\mathcal{J},s\in\mathcal{S}}$.

In general, the state space of $s$ is not finite. In this case, we should take a suitable finite set of grid points $\widehat{s}\in\widehat{\mathcal{S}}\subset\mathcal{S}$ to approximate the solution well. Then, we should alternatively solve the following system of equations:

$0=\frac{\partial Q_{j}\left(a_{j}^{*}\left(\widehat{s}\right),a_{-j}^{*}\left(\widehat{s}\right),\widehat{s};V_{j}\right)}{\partial a_{j}^{*}\left(\widehat{s}\right)}=\frac{\partial r_{j}\left(a_{j}^{*}\left(\widehat{s}\right),a_{-j}^{*}\left(\widehat{s}\right),\widehat{s}\right)}{\partial a_{j}^{*}\left(\widehat{s}\right)}+\beta\frac{\partial\left[\int\overline{V_{j}}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a_{j}^{*}\left(\widehat{s}\right),a_{-j}^{*}\left(\widehat{s}\right)\right)ds^{\prime}\right]}{\partial a_{j}^{*}\left(\widehat{s}\right)},\qquad(7)$

$V_{j}\left(\widehat{s}\right)=r_{j}\left(a^{*}\left(\widehat{s}\right),\widehat{s}\right)+\beta\int\overline{V_{j}}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a^{*}\left(\widehat{s}\right)\right)ds^{\prime}.\qquad(8)$

Algorithm 3 shows the proposed VF-PGI algorithm for the dynamic game.

1. Take grid points $\widehat{s}\in\widehat{\mathcal{S}}$. Set initial values $V^{(0)}\equiv\left\{V_{j}^{(0)}\left(\widehat{s}\right)\right\}_{j\in\mathcal{J},\widehat{s}\in\widehat{\mathcal{S}}}$ and $a^{(0)}\equiv\left\{a_{j}^{(0)}\left(\widehat{s}\right)\right\}_{j\in\mathcal{J},\widehat{s}\in\widehat{\mathcal{S}}}$.

2. Iterate the following ($n=0,1,\cdots$):

   (a) Solve for $\left\{a_{j}^{(n*)}\left(\widehat{s}\right)\right\}_{j\in\mathcal{J},\widehat{s}\in\widehat{\mathcal{S}}}$:

   $a_{j}^{(n*)}\left(\widehat{s}\right)=\Phi_{a,j}\left(V^{(n)},a^{(n)}\right)\left(\widehat{s}\right)\equiv a_{j}^{(n)}\left(\widehat{s}\right)+\lambda\cdot\frac{\partial Q_{j}\left(a_{j}^{(n)}\left(\widehat{s}\right),a_{-j}^{(n)}\left(\widehat{s}\right),\widehat{s};V_{j}^{(n)}\right)}{\partial a_{j}^{(n)}\left(\widehat{s}\right)},$

   where

   $\frac{\partial Q_{j}\left(a_{j}^{(n)}\left(\widehat{s}\right),a_{-j}^{(n)}\left(\widehat{s}\right),\widehat{s};V_{j}^{(n)}\right)}{\partial a_{j}^{(n)}\left(\widehat{s}\right)}\equiv\frac{\partial r_{j}\left(a_{j}^{(n)}\left(\widehat{s}\right),a_{-j}^{(n)}\left(\widehat{s}\right),\widehat{s}\right)}{\partial a_{j}^{(n)}\left(\widehat{s}\right)}+\beta\frac{\partial\left[\int\overline{V_{j}}^{(n)}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a_{j}^{(n)}\left(\widehat{s}\right),a_{-j}^{(n)}\left(\widehat{s}\right)\right)ds^{\prime}\right]}{\partial a_{j}^{(n)}\left(\widehat{s}\right)}.$

   (b) Find

   $V_{j}^{(n*)}\left(\widehat{s}\right)\equiv\Phi_{V,j}\left(V^{(n)},a^{(n)}\right)\left(\widehat{s}\right)=r_{j}\left(a^{(n)}\left(\widehat{s}\right),\widehat{s}\right)+\beta\int\overline{V_{j}}^{(n)}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a^{(n)}\left(\widehat{s}\right)\right)ds^{\prime}$

   for $j\in\mathcal{J},\ \widehat{s}\in\widehat{\mathcal{S}}$.

   (c) Update $a$ and $V$ in either of the following ways:

   • Simple: $a_{j}^{(n+1)}\left(\widehat{s}\right)\leftarrow a_{j}^{(n*)}\left(\widehat{s}\right)$, $V_{j}^{(n+1)}\left(\widehat{s}\right)\leftarrow V_{j}^{(n*)}\left(\widehat{s}\right)$

   • Spectral algorithm: For $z_{j}\in\left\{V_{j},a_{j}^{d=1,\cdots,D}\right\}$, $z_{j}^{(n+1)}\left(\widehat{s}\right)\leftarrow\alpha_{z}^{(n)}z_{j}^{(n*)}\left(\widehat{s}\right)+(1-\alpha_{z}^{(n)})z_{j}^{(n)}\left(\widehat{s}\right)$, where $\alpha_{z}^{(n\geq 1)}\equiv\frac{\left\|z^{(n)}-z^{(n-1)}\right\|_{2}}{\left\|F_{z}\left(V^{(n)},a^{(n)}\right)-F_{z}\left(V^{(n-1)},a^{(n-1)}\right)\right\|_{2}}$ and $F_{z}(V,a)\equiv\Phi_{z}\left(V,a\right)-z$.

   (d) Exit the iteration if $\left\|V^{(n+1)}-V^{(n)}\right\|<\epsilon_{V}$ and $\left\|a^{d(n+1)}-a^{d(n)}\right\|<\epsilon_{a^{d}}\ (d=1,\cdots,D)$.


Algorithm 3 VF-PGI algorithm (Multi-agent dynamic game)

Intuitively, in the VF-PGI algorithm, we first set initial values of all agents' value functions $V$ and actions $a$. Then, if the gradient $\frac{\partial Q_{j}}{\partial a_{j}}$ of agent $j$'s “long-run profit”, given their initial value functions and all agents' actions, is positive, we increase the value of agent $j$'s action, expecting a larger long-run profit for agent $j$. If it is negative, we decrease the value. When the gradient is close to zero, the action is close to optimal given the value functions and all agents' actions, and we rarely change the value of agent $j$'s action.

Regarding value functions, we update the values of agent $j$'s value function so that they align with agent $j$'s Bellman equations given their own initial value functions and all agents' actions. Note that the initial actions in this step might not be optimal. We repeat the procedure for all agents, and iterate these steps until the initial and updated values are sufficiently close.

As in the single-agent version, we need not solve nonlinear problems in each iteration, which contributes to a smaller computational burden.
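As a brief illustration of this point, the sketch below performs one simple (non-spectral) pass of steps 2(a)-(b) of Algorithm 3; `bellman_rhs_j` and `dQj_daj` are hypothetical user-supplied functions that take agent $j$'s value function together with all agents' current actions, which is where the strategic interaction enters. The spectral update would then be applied exactly as in the single-agent sketch.

```python
def vf_pgi_game_step(V, a, bellman_rhs_j, dQj_daj, lam):
    """One simple update of Algorithm 3, steps 2(a)-(b): each agent's action takes a
    gradient step on its own Q_j, evaluated at ALL agents' current actions, and each
    value function moves to its Bellman right-hand side. V[j], a[j] are NumPy arrays
    over grid points for agent j."""
    J = len(V)
    a_new = [a[j] + lam * dQj_daj(j, V[j], a) for j in range(J)]
    V_new = [bellman_rhs_j(j, V[j], a) for j in range(J)]
    return V_new, a_new
```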

In previous studies, an algorithm that is a slight extension of the single-agent VFI algorithm has been applied; we term it the VFI* algorithm.[6] I show the algorithm in Algorithm 4. The difference between VF-PGI (Algorithm 3) and VFI* (Algorithm 4) lies in Step 2(a). In VFI*, we must solve nonlinear equations many times; in contrast, in VF-PGI, we need not solve nonlinear problems in this step.

[6] Corresponding algorithms have been applied in Pakes and McGuire (1994), Miranda and Fackler (2004), and subsequent studies. Note that the original algorithms in these studies update $V$ by using $a^{(n+1)}$, rather than $a^{(n)}$, in Step 2(b). To make the comparison with VF-PGI easier, I present the algorithm based on the latter specification.

1. Take grid points $\widehat{s}\in\widehat{\mathcal{S}}$. Set initial values $V^{(0)}\equiv\left\{V_{j}^{(0)}\left(\widehat{s}\right)\right\}_{j\in\mathcal{J},\widehat{s}\in\widehat{\mathcal{S}}}$ and $a^{(0)}\equiv\left\{a_{j}^{(0)}\left(\widehat{s}\right)\right\}_{j\in\mathcal{J},\widehat{s}\in\widehat{\mathcal{S}}}$.

2. Iterate the following ($n=0,1,\cdots$):

   (a) Solve for $a_{j}^{(n*)}\left(\widehat{s}\right)=\Phi_{a}\left(V^{(n)},a^{(n)}\right)\left(\widehat{s}\right)$ satisfying the following nonlinear equation for each $j\in\mathcal{J},\ \widehat{s}\in\widehat{\mathcal{S}}$:

   $0=\frac{\partial Q_{j}(a_{j}^{(n*)}\left(\widehat{s}\right),a_{-j}^{(n)}\left(\widehat{s}\right),\widehat{s};V_{j}^{(n)})}{\partial a_{j}^{(n*)}\left(\widehat{s}\right)}=\frac{\partial r_{j}\left(a_{j}^{(n*)}\left(\widehat{s}\right),a_{-j}^{(n)}\left(\widehat{s}\right),\widehat{s}\right)}{\partial a_{j}^{(n*)}\left(\widehat{s}\right)}+\beta\frac{\partial\left[\int\overline{V_{j}}^{(n)}\left(s^{\prime}\right)p(s^{\prime}|\widehat{s},a_{j}^{(n*)}\left(\widehat{s}\right),a_{-j}^{(n)}\left(\widehat{s}\right))ds^{\prime}\right]}{\partial a_{j}^{(n*)}\left(\widehat{s}\right)}.$

   (b) Find

   $V_{j}^{(n*)}\left(\widehat{s}\right)\equiv\Phi_{V,j}\left(V^{(n)},a^{(n)}\right)\left(\widehat{s}\right)=r_{j}\left(a^{(n)}\left(\widehat{s}\right),\widehat{s}\right)+\beta\int\overline{V_{j}}^{(n)}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a^{(n)}\left(\widehat{s}\right)\right)ds^{\prime}$

   for $j\in\mathcal{J},\ \widehat{s}\in\widehat{\mathcal{S}}$.

   (c) Update $a$ and $V$:

   • Simple: $a_{j}^{(n+1)}\left(\widehat{s}\right)\leftarrow a_{j}^{(n*)}\left(\widehat{s}\right)$, $V_{j}^{(n+1)}\left(\widehat{s}\right)\leftarrow V_{j}^{(n*)}\left(\widehat{s}\right)$

   • Spectral algorithm: For $z_{j}\in\left\{V_{j},a_{j}^{d=1,\cdots,D}\right\}$, $z_{j}^{(n+1)}\left(\widehat{s}\right)\leftarrow\alpha_{z}^{(n)}z_{j}^{(n*)}\left(\widehat{s}\right)+(1-\alpha_{z}^{(n)})z_{j}^{(n)}\left(\widehat{s}\right)$, where $\alpha_{z}^{(n\geq 1)}\equiv\frac{\left\|z^{(n)}-z^{(n-1)}\right\|_{2}}{\left\|F_{z}\left(V^{(n)},a^{(n)}\right)-F_{z}\left(V^{(n-1)},a^{(n-1)}\right)\right\|_{2}}$ and $F_{z}(V,a)\equiv\Phi_{z}\left(V,a\right)-z$.

   (d) Exit the iteration if $\left\|V^{(n+1)}-V^{(n)}\right\|<\epsilon_{V}$ and $\left\|a^{d(n+1)}-a^{d(n)}\right\|<\epsilon_{a^{d}}\ (d=1,\cdots,D)$.

Algorithm 4 VFI* algorithm (Multi-agent dynamic game)
Remark 3.

As in the VFI* algorithm, there is no guarantee that the iteration is a contraction. Furthermore, given the existence of multiple equilibria, there is no guarantee that we can find all equilibria even when starting from several initial values.

3.3 Spectral algorithm

We can speed up the convergence of the algorithm by introducing the spectral algorithm. To introduce the idea, we first discuss the fixed-point-iteration representation of the proposed VF-PGI algorithm. We then briefly discuss the spectral algorithm, summarizing the detailed discussion in Fukasawa (2024).

Fixed point iteration representation of the algorithm

VF-PGI (without the spectral algorithm) can be considered a fixed point iteration. To see this, consider the single-agent dynamic model discussed in Section 3.1. Let $x=\left(V,a\right)$, where $V\equiv\left\{V\left(\widehat{s}\right)\right\}_{\widehat{s}\in\widehat{\mathcal{S}}}$ and $a\equiv\left\{a\left(\widehat{s}\right)\right\}_{\widehat{s}\in\widehat{\mathcal{S}}}$. We define a mapping $\Phi:B_{V}\times B_{a}\rightarrow B_{V}\times B_{a}$ such that $\Phi(x)\equiv\Phi(V,a)\equiv\left(\Phi_{V}(V,a),\Phi_{a}(V,a)\right)$. Here, we define $\Phi_{V}:B_{V}\times B_{a}\rightarrow B_{V}$ such that

$\Phi_{V}(V,a)\left(\widehat{s}\right)\equiv r(a\left(\widehat{s}\right),\widehat{s})+\beta\int\overline{V}\left(s^{\prime}\right)p\left(s^{\prime}|\widehat{s},a\left(\widehat{s}\right)\right)ds^{\prime},$

and $\Phi_{a}:B_{V}\times B_{a}\rightarrow B_{a}$ such that

$\Phi_{a}(V,a)\left(\widehat{s}\right)=a\left(\widehat{s}\right)+\lambda\cdot\left[\frac{\partial r\left(a\left(\widehat{s}\right),\widehat{s}\right)}{\partial a\left(\widehat{s}\right)}+\beta\frac{\partial\left[\int\overline{V}\left(s^{\prime}\right)p(s^{\prime}|\widehat{s},a\left(\widehat{s}\right))ds^{\prime}\right]}{\partial a\left(\widehat{s}\right)}\right].$

$\overline{V}$ is an interpolated value based on the values of $V\equiv\left\{V\left(\widehat{s}\right)\right\}_{\widehat{s}\in\widehat{\mathcal{S}}}$. $B_{V}$ and $B_{a}$ denote the domains of $V$ and $a$, respectively. Hence, the proposed algorithm is equivalent to repeating the fixed point iteration $x^{(n+1)}=\Phi\left(x^{(n)}\right)$ until convergence, starting from appropriate initial values $x^{(0)}\equiv\left(V^{(0)},a^{(0)}\right)$. An analogous representation is possible when we consider multi-agent dynamic games.

Spectral algorithm

The spectral algorithm is designed to solve nonlinear equations and nonlinear continuous optimization problems. When we want to solve a nonlinear equation $F(x)=0$, $x^{(n)}$ is iteratively updated as follows, given initial values $x^{(0)}$:[7]

$x^{(n+1)}=x^{(n)}+\alpha^{(n)}F\left(x^{(n)}\right)\ (n=0,1,2,\cdots)$

Here, $\alpha^{(n)}$ is generally chosen based on the values of $s^{(n)}\equiv x^{(n)}-x^{(n-1)}$ and $y^{(n)}\equiv F(x^{(n)})-F(x^{(n-1)})$, and it can vary across $n=0,1,2,\cdots$.

[7] Newton's method, which uses the updating equation $x^{(n+1)}=x^{(n)}-\left(\nabla F(x^{(n)})\right)^{-1}F\left(x^{(n)}\right)\ (n=0,1,2,\cdots)$, attains fast convergence around the solution. Although Newton's method can be applied to solve the equation, computing $\nabla F(x^{(n)})$ requires coding analytical first derivatives of $F$, which is not an easy task, especially when the function $F$ is complicated. Moreover, especially when $n_{x}$, the dimension of $x$, is large, computing the inverse of the $n_{x}\times n_{x}$ matrix $\nabla F(x^{(n)})$ is computationally costly. Hence, the spectral algorithm is attractive from the perspective of simplicity and computational cost.

The idea of the spectral algorithm can also be utilized in the context of fixed point iterations. Suppose we want to solve a fixed point constraint $x=\Phi(x)$. Then, by letting $F(x)=\Phi(x)-x$, we can solve $x=\Phi(x)$ by iteratively updating the values of $x$ by:[8]

$x^{(n+1)}=x^{(n)}+\alpha^{(n)}\left(\Phi\left(x^{(n)}\right)-x^{(n)}\right)=\alpha^{(n)}\Phi(x^{(n)})+(1-\alpha^{(n)})x^{(n)}\ (n=0,1,2,\cdots).$

[8] The spectral algorithm can be thought of as a generalization of the extrapolation method (cf. Judd, 1998) with varying values of $\alpha$.

As discussed in detail in Varadhan and Roland (2008) and Fukasawa (2024),[9] the step size $\alpha^{(n)}=\alpha_{S3}\equiv\frac{\left\|s^{(n)}\right\|_{2}}{\left\|y^{(n)}\right\|_{2}}>0$ works well. Algorithm 5 shows the detailed steps of the spectral algorithm under the step size $\alpha_{S3}$.[10]

[9] The step size $\alpha_{S3}\equiv\frac{\left\|s^{(n)}\right\|_{2}}{\left\|y^{(n)}\right\|_{2}}$ can be derived from a simple optimization problem $\min_{\alpha^{(n)}}\frac{\left\|x^{(n+1)}-x^{(n)}\right\|_{2}^{2}}{\left|\alpha^{(n)}\right|}=\frac{\left\|s^{(n)}+\alpha^{(n)}y^{(n)}\right\|_{2}^{2}}{\left|\alpha^{(n)}\right|}$. For details, see Section 5 of Fukasawa (2024).

[10] Previous studies (e.g., La Cruz et al., 2006) in the numerical literature have proposed globalization strategies to guarantee global convergence. However, the iteration without globalization strategies works well, as shown in Section 5, and I do not discuss them here.

1. Set initial values $x^{(0)}$. Choose $\alpha^{(0)}$ and the tolerance level $\epsilon$.

2. Iterate the following ($n=0,1,2,\cdots$):

   (a) Compute $\Phi\left(x^{(n)}\right)$ and $F\left(x^{(n)}\right)=\Phi\left(x^{(n)}\right)-x^{(n)}$.

   (b) Compute $s^{(n)}\equiv x^{(n)}-x^{(n-1)}$, $y^{(n)}\equiv F(x^{(n)})-F(x^{(n-1)})$, and $\alpha^{(n)}=\frac{\left\|s^{(n)}\right\|_{2}}{\left\|y^{(n)}\right\|_{2}}$ if $n\geq 1$.

   (c) Compute $x^{(n+1)}=x^{(n)}+\alpha^{(n)}F(x^{(n)})$.

   (d) If $\left\|\Phi(x^{(n)})-x^{(n)}\right\|<\epsilon$, exit the iteration. Otherwise, go back to Step 2(a).

Algorithm 5 Spectral algorithm with fixed point mapping
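For concreteness, the following is a minimal Python sketch of Algorithm 5 for a generic user-supplied fixed-point mapping `phi`, using the step size $\alpha_{S3}$; the square-root example at the end is purely illustrative.

```python
import numpy as np

def spectral_fixed_point(phi, x0, alpha0=1.0, tol=1e-10, max_iter=10_000):
    """Solve x = phi(x) via x_{n+1} = x_n + alpha_n F(x_n), F(x) = phi(x) - x,
    with alpha_n = ||s_n||_2 / ||y_n||_2 (step size alpha_S3) for n >= 1."""
    x = np.asarray(x0, dtype=float)
    F = phi(x) - x
    alpha = alpha0
    for n in range(max_iter):
        if np.linalg.norm(F) < tol:
            return x, n
        x_new = x + alpha * F
        F_new = phi(x_new) - x_new
        s, y = x_new - x, F_new - F
        alpha = np.linalg.norm(s) / np.linalg.norm(y)
        x, F = x_new, F_new
    return x, max_iter

# Example: x = 0.5 * (x + c / x) (Babylonian map), whose fixed point is sqrt(c).
c = 2.0
x_star, iters = spectral_fixed_point(lambda x: 0.5 * (x + c / x), np.array([1.0]))
print(x_star, iters)   # approximately [1.41421356]
```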

Variable-type-specific step size

Although the standard spectral algorithm uses a scalar step size $\alpha^{(n)}$, I introduce variable-type-specific step sizes into the spectral algorithm.

To introduce the idea, suppose we would like to solve a nonlinear equation $F(x)=\left(F_{1}(x),F_{2}(x)\right)=0\in\mathbb{R}^{n_{1}+n_{2}}$, where $x=\left(x_{1},x_{2}\right)\in\mathbb{R}^{n_{1}+n_{2}}$. When using the standard step size $\alpha_{S3}$, we set $\alpha^{(n)}=\frac{\left\|s^{(n)}\right\|_{2}}{\left\|y^{(n)}\right\|_{2}}$, where $s^{(n)}\equiv x^{(n)}-x^{(n-1)}$ and $y^{(n)}\equiv F(x^{(n)})-F(x^{(n-1)})$, and we update $x$ by $x^{(n+1)}=x^{(n)}+\alpha^{(n)}F\left(x^{(n)}\right)$.

Here, we can alternatively update the values of $x=\left(x_{1},x_{2}\right)$ block by block: $x_{m}^{(n+1)}=x_{m}^{(n)}+\alpha_{m}^{(n)}F_{m}(x^{(n)})$, where $\alpha_{m}^{(n)}=\frac{\left\|s_{m}^{(n)}\right\|_{2}}{\left\|y_{m}^{(n)}\right\|_{2}}$, $s_{m}^{(n)}\equiv x_{m}^{(n)}-x_{m}^{(n-1)}$, and $y_{m}^{(n)}\equiv F_{m}(x^{(n)})-F_{m}(x^{(n-1)})\ (m=1,2)$. The idea can be generalized to the case where $N\in\mathbb{N}$ types of variables exist.

As discussed earlier, $\alpha^{(n)}$ is chosen to accelerate convergence. In principle, any choice of $\alpha^{(n)}$ is allowed in the spectral algorithm provided it works, and its numerical performance is what matters most.[11]

[11] We can justify the strategy with a simple thought experiment. Suppose $x_{1}^{(n)}$ is fixed at the true value $x_{1}^{*}$, and $x_{2}^{(n)}$ is not. Then, we should solve the equation $F_{2}(x_{2};x_{1}^{*})=0$ as an equation of $x_{2}$, and setting $\alpha_{2}^{(n)}=\frac{\left\|s_{2}^{(n)}\right\|_{2}}{\left\|y_{2}^{(n)}\right\|_{2}}$ is desirable. If we instead impose $\alpha_{1}^{(n)}=\alpha_{2}^{(n)}=\alpha^{(n)}$, we should set $\alpha^{(n)}=\frac{\left\|s^{(n)}\right\|_{2}}{\left\|y^{(n)}\right\|_{2}}=\frac{\sqrt{\left\|s_{1}^{(n)}\right\|_{2}^{2}+\left\|s_{2}^{(n)}\right\|_{2}^{2}}}{\sqrt{\left\|y_{1}^{(n)}\right\|_{2}^{2}+\left\|y_{2}^{(n)}\right\|_{2}^{2}}}$, which may not be equal to $\frac{\left\|s_{2}^{(n)}\right\|_{2}}{\left\|y_{2}^{(n)}\right\|_{2}}$.
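The block-wise rule described above amounts to the following small sketch, assuming the current and previous iterates and residual blocks have already been computed; all names are generic.

```python
import numpy as np

def blockwise_spectral_update(x1, x2, x1_prev, x2_prev, F1, F2, F1_prev, F2_prev):
    """Variable-type-specific step sizes alpha_m = ||s_m||_2 / ||y_m||_2 (m = 1, 2),
    then the block-wise update x_m + alpha_m * F_m(x)."""
    alpha1 = np.linalg.norm(x1 - x1_prev) / np.linalg.norm(F1 - F1_prev)
    alpha2 = np.linalg.norm(x2 - x2_prev) / np.linalg.norm(F2 - F2_prev)
    return x1 + alpha1 * F1, x2 + alpha2 * F2
```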

4 Extension: Box constraint

So far, we have only considered settings in which the first-order equality conditions of the optimization problems hold exactly. However, in some applications, introducing box constraints on the domain of actions, such as nonnegativity constraints, may be suitable. For example, in the investment competition model discussed by Pakes and McGuire (1994), each firm's investment should be nonnegative, and the first-order equality conditions do not necessarily hold at boundary points. This section shows that we can easily incorporate box constraints into the proposed algorithm.

As discussed in Section 3.3, the proposed algorithm under no binding box constraints corresponds to solving a fixed point problem $x=\Phi(x)\equiv\left(\Phi_{V}(V,a),\Phi_{a}(V,a)\right)$, where $x\equiv(V,a)$. In contrast, in the setting where some of the box constraints are binding, we cannot directly derive fixed-point representations. For instance, in the case of the single-agent dynamic optimization problem, let $l$ and $u$ be the lower and upper bounds of the actions. Then, the following should hold:

$\frac{\partial Q(a^{*}\left(s\right),s;V)}{\partial a^{*}\left(s\right)}\begin{cases}\leq 0&\text{if }a^{*}\left(s\right)=l\\ =0&\text{if }a^{*}\left(s\right)\in(l,u)\\ \geq 0&\text{if }a^{*}\left(s\right)=u\end{cases}$

Motivated by this, suppose that we want to solve the following general mathematical problem:

$f_{i}(x)\begin{cases}\leq 0&\text{if }x_{i}=l_{i}\\ =0&\text{if }x_{i}\in(l_{i},u_{i})\\ \geq 0&\text{if }x_{i}=u_{i}\end{cases}\qquad(10)$

where $f:B\subset\mathbb{R}^{I}\rightarrow\mathbb{R}^{I}$ is a continuous function. Here, $B\equiv\Pi_{i\in\{1,\cdots,I\}}B_{i}$ and $B_{i}\equiv[l_{i},u_{i}]$.

Suppose that a function $\Phi:B\rightarrow\mathbb{R}^{I}$ satisfies:

$f_{i}(x)=0\iff x_{i}=\Phi_{i}(x)\qquad(11)$

Here, we define a function $\widehat{\Phi}:B\rightarrow B$ such that:

$\widehat{\Phi}_{i}(x)\equiv\begin{cases}l_{i}&\text{if }f_{i}(x)\leq 0\ \&\ x_{i}=l_{i}\\ u_{i}&\text{if }f_{i}(x)\geq 0\ \&\ x_{i}=u_{i}\\ P_{B_{i}}\left(\Phi_{i}(x)\right)&\text{otherwise}\end{cases}$

where $P_{B}(x)$ denotes the projection of $x$ onto a convex set $B$.

Then, the following statement holds:

Proposition 1.

$x^{*}$ is the solution to (10) $\iff$ $x^{*}$ is the solution to $\widehat{\Phi}(x)=x$.

Regarding the dynamic optimization problems with continuous actions, we can construct an alternative mapping $\widehat{\Phi}$ from the fixed point mapping $\Phi$ under no binding constraints and the optimality conditions using $\frac{\partial Q}{\partial a}$. Then, the problem is equivalent to solving $x=\widehat{\Phi}(x)$, and we can solve the problem using fixed-point iterations $x^{(n+1)}=\widehat{\Phi}(x^{(n)})$. Note that we can also introduce the spectral algorithm.
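A minimal sketch of the mapping $\widehat{\Phi}$ in Python, assuming the vectors $f(x)$ and $\Phi(x)$ have already been evaluated at the current iterate; `np.clip` plays the role of the projection $P_{B_i}$, and the element-wise logic mirrors the case distinction above.

```python
import numpy as np

def phi_hat(x, f_vals, phi_vals, l, u):
    """Box-constrained mapping: keep x_i at a bound when the optimality condition
    already holds there; otherwise project Phi_i(x) back onto [l_i, u_i]."""
    out = np.clip(phi_vals, l, u)                      # P_{B_i}(Phi_i(x))
    out = np.where((x == l) & (f_vals <= 0), l, out)   # lower bound binding
    out = np.where((x == u) & (f_vals >= 0), u, out)   # upper bound binding
    return out
```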

Regarding the continuity of the function $\widehat{\Phi}$, the following statement applies.

Proposition 2.

$\widehat{\Phi}$ is a continuous function if $\Phi$ is continuous, $\lim_{x_{i}\rightarrow l_{i}+0,\ f_{i}(x)\leq 0}\Phi_{i}(x)\leq l_{i}$, and $\lim_{x_{i}\rightarrow u_{i}-0,\ f_{i}(x)\geq 0}\Phi_{i}(x)\geq u_{i}$.

Then, we can easily derive the following statement for the mapping of VF-PGI, which we denote $\Phi^{VF\text{-}PGI}$:

Corollary 1.

Suppose that $\Phi^{VF\text{-}PGI}$ is a continuous function. Then, $\widehat{\Phi^{VF\text{-}PGI}}$, derived from $\Phi^{VF\text{-}PGI}$ by applying the box constraint on the domain of actions $a$, is continuous.

5 Numerical experiments

This section uses numerical experiments to show the performance of the proposed VF-PGI-Spectral algorithm in three settings. Section 5.1 shows results for the single-agent neoclassical growth model with elastic labor supply, which is extensively considered in the macroeconomics literature. Section 5.2 considers a dynamic investment competition model with continuous states. Section 5.3 considers a dynamic investment competition model with discrete states, examined by Pakes and McGuire (1994). All experiments were conducted on a laptop computer with an AMD Ryzen 5 6600H 3.30 GHz CPU, 16.0 GB of RAM, Windows 11 64-bit, and MATLAB 2022b. When applying the spectral algorithm, we use $\alpha_0=1$ in this section. For details of each experiment, see Appendix B.

5.1 Single-agent neoclassical growth model with elastic labor supply

Settings

As in Maliar and Maliar (2013), we consider a standard single-agent neoclassical growth model with elastic labor supply. The dynamic optimization problem is characterized by:

$V(k,z)=\max_{l,c}\ u(c,l)+\beta E\left[V(k^{\prime},z^{\prime})\right]$

$\text{s.t.}\quad k^{\prime}=(1-\delta)k+zr(k,l)-c$

$\ln z^{\prime}=\rho\ln z+\epsilon^{\prime},\ \epsilon^{\prime}\sim N(0,\sigma^{2})$

$u(c,l)=\frac{c^{1-\gamma}-1}{1-\gamma}+B\frac{(1-l)^{1-\mu}-1}{1-\mu}$

$r(k,l)=Ak^{\alpha}l^{1-\alpha},$

where $k$, $c$, $l$, and $z$ denote capital, consumption, labor, and the productivity level, respectively.

The first-order conditions with respect to $l$ and $c$ are:

$0=\frac{\partial Q(l,c,k,z;V)}{\partial l(k,z)}=u_{l}(c,l)+Azf_{l}\cdot\beta\frac{\partial E\left[V(k^{\prime},z^{\prime})\right]}{\partial k^{\prime}},\qquad(13)$

$0=\frac{\partial Q(l,c,k,z;V)}{\partial c(k,z)}=u_{c}(c,l)-\beta\frac{\partial E\left[V(k^{\prime},z^{\prime})\right]}{\partial k^{\prime}},\qquad(14)$

where $f_{l}$ denotes the partial derivative of $f(k,l)\equiv k^{\alpha}l^{1-\alpha}$ with respect to $l$.

In this study, we try the following algorithms: VF-PGI-Spectral, VFI-Spectral, ECM-Spectral, EGM-Spectral, VFI, ECM, and EGM. In the VF-PGI-Spectral algorithm, we jointly update the values of the value function $V$ and the actions $l,c$ until convergence, based on the Bellman equation above and equations (13) and (14), combined with the spectral algorithm.

It should be noted that $c=\left(\frac{B(1-l)^{-\mu}}{(1-\alpha)Azk^{\alpha}l^{-\alpha}}\right)^{-\frac{1}{\gamma}}$ holds by equations (13) and (14). Hence, given the value of $l$, we can compute the value of $c$ analytically. Therefore, we can alternatively consider an algorithm that jointly updates the value function $V$ and the action $l$ until convergence, recovering the values of $c$ within each iteration by this analytical formula. We denote this algorithm VF-PGI*-Spectral, and we also validate its performance.
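As a small illustration, the analytical recovery of $c$ from $l$ used in VF-PGI*-Spectral can be coded directly from the formula above; the parameter names follow the model, and the function name is illustrative.

```python
def consumption_from_labor(l, k, z, A, B, alpha, gamma, mu):
    """Recover c from l by combining the two first-order conditions:
       B*(1-l)**(-mu) = (1-alpha)*A*z*k**alpha*l**(-alpha) * c**(-gamma),
       so c = (B*(1-l)**(-mu) / ((1-alpha)*A*z*k**alpha*l**(-alpha)))**(-1/gamma)."""
    marginal_product_l = (1.0 - alpha) * A * z * k**alpha * l**(-alpha)
    return (B * (1.0 - l)**(-mu) / marginal_product_l) ** (-1.0 / gamma)
```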

The experiments are conducted by modifying and extending the replication code of Maliar and Maliar (2013). As in Maliar and Maliar (2013), we choose the steady-state capital-output ratio $\pi_{k}=10$, steady-state consumption-output ratio $\pi_{c}=3/4$, steady-state labor $\overline{l}=1/3$, and parameters $A=\frac{1/\beta-(1-\delta)}{\alpha}$, $B=(1-\alpha)\pi_{k}^{(1-\gamma)\alpha/(1-\alpha)}\pi_{c}^{-\gamma}(1-\overline{l})^{\mu}\overline{l}^{-\mu}$, $\gamma=2$, $\mu=2$, $\alpha=1/3$, $\beta=0.99$, $\delta=0.025$, $\rho=0.95$, $\sigma=0.01$.

Results

Table 1 shows the performance of the algorithms. The VF-PGI-Spectral algorithm converges significantly faster than the other methods, including VFI, EGM, and ECM. We can also see that introducing the spectral algorithm significantly accelerates the convergence of VFI, EGM, and ECM. Still, VF-PGI-Spectral is superior. As noted in Maliar and Maliar (2013), in the neoclassical growth model with elastic labor supply, neither EGM nor ECM can completely avoid solving nonlinear equations within the iterations. In contrast, VF-PGI-Spectral avoids such a problem. As shown in the last column, the CPU time per iteration for VF-PGI-Spectral is significantly less than that of the other methods. Note that regardless of the choice of algorithm, the numerical accuracy hardly changes.

The table also shows the performance of VF-PGI*-Spectral, which uses the analytical formula relating $l$ and $c$. The results show that VF-PGI*-Spectral requires fewer iterations than VF-PGI-Spectral and converges slightly faster. The findings indicate that convergence might be accelerated by using analytical formulas to reduce the number of variables to be solved.

Table 1: Speed and accuracy of algorithms (Single-agent neoclassical growth model with elastic labor supply)
Method  CPU (sec)  $L_1$  $L_\infty$  Iter.  CPU / Iter.
VF-PGI-Spectral 0.261 -5.425 -3.983 138 0.002
VF-PGI*-Spectral 0.153 -5.425 -3.983 98 0.002
VFI-Spectral 1.171 -5.426 -3.973 52 0.023
ECM-Spectral 0.650 -5.412 -3.958 39 0.017
EGM-Spectral 0.492 -5.345 -3.825 37 0.013
VFI 15.577 -5.425 -3.983 844 0.018
ECM 9.580 -5.406 -3.960 844 0.011
EGM 4.607 -5.345 -3.825 413 0.011

Notes. $\alpha_0=1$, $\lambda=10^{-7}$.

$L_1$ and $L_\infty$ denote the average and the maximum absolute values of Euler residuals in $\log_{10}$ units on a stochastic simulation of 10,000 observations.

Regarding VF-PGI-Spectral, we must specify an exogenous tuning parameter $\lambda$. In the setting shown in Table 1, we chose $\lambda=10^{-7}$.[12] Table 2 shows how the results vary with $\lambda$.

[12] Under $\alpha_0=1$, $a^{(n=1)}=a^{(n=0)}+\lambda\frac{\partial Q}{\partial a}\left(a^{(n=0)}\right)$ holds. If $\lambda$ is excessively large, $a^{(n=1)}$ can be too far from $a^{(n=0)}$ and the true solution, and the iteration may become unstable. In the current setting, $Q$ is of order $10^{5}$, and $\left|\frac{\partial Q}{\partial a}\right|$ can be very large. Hence, we need to choose a small $\lambda$, such as $10^{-7}$.

The results suggest that taking a sufficiently small value of $\lambda$ is essential for the convergence of VF-PGI-Spectral. Note that the size of $\lambda$ has no significant effect on the convergence speed of the algorithm as long as it is sufficiently small, and fine-tuning the value of $\lambda$ is not critical as long as it leads to convergence.

Table 2: Effect of λ\lambda in the VF-PGI-Spectral algorithm (Single-agent neoclassical growth model with elastic labor supply)
Method  $\log_{10}(\lambda)$  CPU (sec)  $L_1$  $L_\infty$  Iter.  Conv.
VF-PGI-Spectral -10 0.22 -5.425 -3.983 108 1
VF-PGI-Spectral -9 0.278 -5.425 -3.983 145 1
VF-PGI-Spectral -8 0.264 -5.425 -3.983 135 1
VF-PGI-Spectral -7 0.279 -5.425 -3.983 140 1
VF-PGI-Spectral -6 0.092 NaN NaN 38 0
VF-PGI-Spectral -5 0.016 NaN NaN 5 0

Notes. $\alpha_0=1$.

$L_1$ and $L_\infty$ denote the average and the maximum absolute values of Euler residuals in $\log_{10}$ units on a stochastic simulation of 10,000 observations.

We set Conv. = 0 if the iteration diverges.

Remark 4.

Note that even when $\lambda=1$, we can successfully solve the dynamic model by setting a sufficiently small value of $\alpha_0$, which represents the initial step size of the spectral algorithm. As in the case of $\lambda$, the size of $\alpha_0$ does not significantly affect the convergence speed of the algorithm as long as the iteration converges. Appendix C.1 shows the numerical results.

Remark 5.

As shown in Appendix C.2, ECM-Spectral slightly outperforms VF-PGI-Spectral in the model with inelastic labor supply. In that model, both VF-PGI-Spectral and ECM-Spectral completely avoid solving nonlinear equations inside the iterations. Unlike VF-PGI-Spectral, which essentially solves for two types of variables (value functions and actions), ECM-Spectral requires solving only for value functions. Thus, using analytical formulas to reduce the number of variables to be solved may lead to faster convergence. For general models where deriving precise analytical formulas is challenging, VF-PGI-Spectral would outperform other methods, despite the potential increase in the number of iterations.

5.2 Dynamic investment competition model with continuous states

Settings

We examine a simple stationary dynamic investment competition model.131313A nonstationary version of the model was empirically applied in Fukasawa and Ohashi, (2023), and an analogous model was also considered in Miranda and Fackler, (2004). Firms j=1,,Jj=1,\cdots,J produce and sell a homogeneous product in a single market in each period. Each firm makes investment decisions to lower the marginal production costs. Firm jj’s dynamic optimization problem related to its investment is given by:

maxij(s)\displaystyle\max_{i_{j}(s)\in\mathbb{R}} πj(s)c(kj,ij(s))+βEzVj(kj,kj,z)\displaystyle\pi_{j}(s)-c\left(k_{j},i_{j}(s)\right)+\beta E_{z}V_{j}\left(k_{j}^{\prime},k_{-j}^{\prime},z^{\prime}\right)
s.t.kj=(1δ)kj+ij\displaystyle s.t.\ k_{j}^{\prime}=(1-\delta)k_{j}+i_{j}

Here, π_j(s) denotes firm j's per-period profit from producing and selling the product given states s≡(k,z). k≡(k_j,k_{-j}) denotes all firms' capital stocks, and z≡(ξ,μ) denotes the exogenous state variables. c(k_j,i_j(s)) denotes firm j's investment cost given its capital k_j and investment i_j(s), and we assume c(k,i)=θ_k i+θ_a i²/k. k_j'=(1-δ)k_j+i_j represents the deterministic state transition of firm j's capital stock, and δ denotes the depreciation rate of capital.

Assuming constant marginal production costs, we define mc_j(k_j,μ)=k_j^{-γ}·exp(μ) as the marginal cost of firm j given its capital stock k_j, where μ denotes a stochastic cost shock. Firms choose outputs in the market, and we assume static Cournot competition with respect to the output choice. Let Q(P)=P^{-η}·exp(ξ) be the demand function of the product, where Q denotes the aggregate quantity, P denotes the price, and ξ denotes a stochastic demand shock. η is a parameter corresponding to the price elasticity of demand. Then, firm j's profit function can be represented as π_j(s)=(P(s)-mc_j(s))q_j(s). Given the values of the state variables s≡(k_j,k_{-j};ξ,μ) and parameters, we can compute the firms' profits.
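For concreteness, the following is a minimal sketch of how the static Cournot profits π_j(s) could be computed numerically given the state (k, ξ, μ); the function name and the example parameter values are illustrative and are not those of the replication code.

```python
import numpy as np
from scipy.optimize import fsolve

def cournot_profits(k, xi, mu, gamma, eta):
    """Static Cournot profits pi_j(s) = (P - mc_j) * q_j given capital stocks k
    and shocks (xi, mu); a sketch with illustrative names."""
    k = np.asarray(k, dtype=float)
    mc = k ** (-gamma) * np.exp(mu)               # mc_j(k_j, mu) = k_j^{-gamma} exp(mu)

    def foc(x):                                   # Cournot first-order conditions
        q = np.exp(x)                             # work in logs to keep quantities positive
        Q = q.sum()
        P = (Q / np.exp(xi)) ** (-1.0 / eta)      # inverse demand from Q(P) = P^{-eta} exp(xi)
        dPdQ = -P / (eta * Q)
        return P + q * dPdQ - mc

    q = np.exp(fsolve(foc, x0=np.zeros_like(k)))  # equilibrium quantities
    P = (q.sum() / np.exp(xi)) ** (-1.0 / eta)
    return (P - mc) * q

# Illustrative call (parameter values are not those used in the experiments):
# cournot_profits(k=[1.0, 1.2], xi=4.0, mu=2.0, gamma=0.5, eta=1.5)
```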

Regarding the exogenous shocks, we assume stationary AR(1) transitions ξ=ρξ(ξξ¯)+ξ¯+uξ\xi^{\prime}=\rho^{\xi}(\xi-\overline{\xi})+\overline{\xi}+u^{\xi} where uξN(0,σξ2)u^{\xi}\sim N(0,\sigma_{\xi}^{2}), and μ=ρμ(μμ¯)+μ¯+uμ\mu^{\prime}=\rho^{\mu}(\mu-\overline{\mu})+\overline{\mu}+u^{\mu} where uμN(0,σμ2)u^{\mu}\sim N(0,\sigma_{\mu}^{2}).

The parameters are specified as follows: α=1.5, β=0.9, δ=0.08, ρ^ξ=ρ^μ=0.9, σ_ξ=σ_μ=0.01, ξ̄=4, μ̄=2, θ_k=0.12, and θ_a=0.3. Regarding VF-PGI-Spectral, we let α_0=1 and λ=0.001.

Results

Table 3 shows the results. Regardless of the number of firms, VF-PGI-Spectral outperforms the VFI*-based methods. Furthermore, the numerical accuracy of the solutions is rarely affected by the choice of algorithm.

Table 3: Speed and accuracy of algorithms (Dynamic investment competition model with continuous states)
JJ Method CPU (sec) L1L_{1} LL_{\infty} Iter. CPU / Iter.
1 VF-PGI-Spectral 0.034 -5.924 -5.38 48 0.001
VFI*-Spectral 0.083 -5.924 -5.38 39 0.002
VFI* 0.191 -5.924 -5.38 98 0.002
2 VF-PGI-Spectral 0.103 -6.697 -5.958 84 0.001
VFI*-Spectral 0.543 -6.697 -5.958 89 0.006
VFI* 0.689 -6.697 -5.958 141 0.005
3 VF-PGI-Spectral 0.741 -7.378 -6.245 86 0.009
VFI*-Spectral 3.765 -7.378 -6.245 63 0.06
VFI* 5.957 -7.378 -6.245 141 0.042

L_{1} and L_{\infty} denote the average and the maximum of \log_{10}\left(\left\|\frac{\partial Q_{j}(a_{j}(\widehat{s}),a_{-j}(\widehat{s}),\widehat{s};V_{j})}{\partial a_{j}(\widehat{s})}\right\|_{\infty}\right) on a stochastic simulation of 100 observations.

5.3 Dynamic investment competition model with discrete states (Pakes and McGuire,, 1994)

Settings

We consider the dynamic investment competition model with discrete states following Pakes and McGuire, (1994).141414To clarify the notations, we slightly alter the notations in Pakes and McGuire, (1994). Investment iji_{j} in the current paper corresponds to xjx_{j} in Pakes and McGuire, (1994), and state transition parameter γ\gamma in the current paper corresponds to aa in Pakes and McGuire, (1994). Single product firms j=1,,Jj=1,\cdots,J produce and sell differentiated products in a single market in each period. Firms make investment decisions to improve their product quality. Firm jj’s dynamic optimization problem related to its investment is given by:

Vj(w)=maxij(w)0Qj(ij(w),ij(w),w;Vj)πj(w)c(ij(w))+βEw[Vj(w)|w,ij(w),ij(w)]V_{j}(w)=\max_{i_{j}(w)\geq 0}Q_{j}(i_{j}(w),i_{-j}^{*}(w),w;V_{j})\equiv\pi_{j}(w)-c\left(i_{j}(w)\right)+\beta E_{w^{\prime}}\left[V_{j}(w^{\prime})|w,i_{j}(w),i_{-j}^{*}(w)\right]

Here, πj(w)\pi_{j}(w) denotes firm jj’s per-period profit function given states w(wj,wj)w\equiv(w_{j},w_{-j}), and c(i)c\left(i\right) denotes the investment cost function. wj{1,2,,K¯}w_{j}\in\{1,2,\cdots,\overline{K}\} denotes firm jj’s capital stock.

State transition of ww is given by wj=wj+τjνw_{j}^{\prime}=w_{j}+\tau_{j}-\nu. Here, τj\tau_{j} represents the output of firm jj’s own investment process, and its probability distribution is given by:

Pr(τj)={γij(w)1+γij(w)ifτj=111+γij(w)ifτj=0.Pr(\tau_{j})=\begin{cases}\frac{\gamma i_{j}(w)}{1+\gamma i_{j}(w)}&\text{if}\ \tau_{j}=1\\ \frac{1}{1+\gamma i_{j}(w)}&\text{if}\ \tau_{j}=0\end{cases}.

ν\nu denotes the industry-wide depreciation shock, and its probability distribution is given by:

Pr(ν)={δifν=11δifν=0.Pr(\nu)=\begin{cases}\delta&\text{if}\ \nu=1\\ 1-\delta&\text{if}\ \nu=0\end{cases}.
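Combining the two distributions, the expected continuation value E_{w'}[V_j(w')|w, i_j(w), i_{-j}^*(w)] is a sum over the realizations of (τ_j, ν) (and, with multiple firms, over competitors' τ's). A minimal single-firm sketch is given below; the function and argument names are illustrative.

```python
def expected_continuation_value(V, w, i, gamma, delta, K_bar):
    """E[V(w') | w, i] for a single firm, with w' = w + tau - nu kept on the
    state grid {1, ..., K_bar}; a sketch with illustrative names.
    V is an array over states, indexed so that V[w - 1] is the value at state w."""
    p_up = gamma * i / (1.0 + gamma * i)               # Pr(tau = 1)
    ev = 0.0
    for tau, p_tau in ((1, p_up), (0, 1.0 - p_up)):
        for nu, p_nu in ((1, delta), (0, 1.0 - delta)):
            w_next = min(max(w + tau - nu, 1), K_bar)  # clip to the grid boundaries
            ev += p_tau * p_nu * V[w_next - 1]
    return ev
```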

In this study, we assume the following functional form of investment cost function c(i)c\left(i\right): c(i)=i+θ2i2c\left(i\right)=i+\theta_{2}i^{2}. The term θ2i2\theta_{2}i^{2} corresponds to the quadratic investment cost.151515Adjustment costs of investment, in the form of quadratic investment costs, have been thoroughly examined in the macroeconomics literature (e.g., Cooper and Haltiwanger,, 2006). See also the discussions in Mermelstein et al., (2020).

We assume that each consumer purchases at most one product in each period. Consumer ii’s utility when purchasing firm jj’s product is Uij=g(wj)pj+ϵijU_{ij}=g(w_{j})-p_{j}+\epsilon_{ij}, where g(wj)g(w_{j}) denotes the quality of the product produced by firm jj at state wjw_{j}. Consumer ii’s utility from buying nothing is ϵi0\epsilon_{i0}. pjp_{j} denotes firm jj’s product price, and {ϵij}j𝒥{0}\left\{\epsilon_{ij}\right\}_{j\in\mathcal{J}\cup\{0\}} denotes idiosyncratic utility shocks following i.i.d. Gumbel distribution. Then, the demand for firm jj’s product is qj(w;p)=Mexp(g(wj)pj)1+k=1Jexp(g(wk)pk)q_{j}(w;p)=M\frac{\exp\left(g(w_{j})-p_{j}\right)}{1+\sum_{k=1}^{J}\exp\left(g(w_{k})-p_{k}\right)}, where MM denotes the market size. Regarding g(w)g(w), we assume g(w)=wg(w)=w for www\leq w^{*} and g(w)=w+log(2exp(w+w))g(w)=w+\log\left(2-\exp\left(-w+w^{*}\right)\right) for w>ww>w^{*}.

Firms choose their product prices in each period, observing the quality of competitors’ products. Let pj(w)p_{j}^{*}(w) be the equilibrium price of firm jj’s product. Then, the profit function of firm jj can be represented as πj(w)=(pj(w)mc)qj(w;p)\pi_{j}(w)=\left(p_{j}^{*}(w)-mc\right)q_{j}\left(w;p^{*}\right), where mcmc denotes the marginal cost of products, which we assume to be common for all products. Given the state variables w(wj,wj)w\equiv\left(w_{j},w_{-j}\right) and parameters, we can compute the firms’ profits. As in Pakes and McGuire, (1994), we use the following parameter values: δ=0.7,mc=5,M=5,K¯=19,w=12,β=0.925.\delta=0.7,mc=5,M=5,\overline{K}=19,w^{*}=12,\beta=0.925.

The experiments are conducted by modifying the replication code of Pakes and McGuire, (1994).161616The replication code of Pakes and McGuire, (1994) was downloaded from https://scholar.harvard.edu/pakes/pages/pakes-maguire-algorithm-0. To simplify the setting, I modified the original code so that no entry/exit decisions exist. Note that the relative performance of the VF-PGI-Spectral algorithm rarely changes even when firms' endogenous entry/exit decisions are introduced. Regarding VF-PGI-Spectral, we let λ=0.01.

We assume nonnegative investments by firms. Combining this constraint with the discussion in Section 4, in VF-PGI-Spectral we update the values of i_j as follows:

i_{j}^{(n+1)}(w)=\begin{cases}i_{j}^{(n)}(w)&\text{if}\ i_{j}^{(n)}(w)=0\ \&\ \frac{\partial Q_{j}(i^{(n)}(w),w;V^{(n)})}{\partial i_{j}^{(n)}(w)}<0\\ i_{j}^{(n)}(w)+\lambda\cdot\frac{\partial Q_{j}(i^{(n)}(w),w;V^{(n)})}{\partial i_{j}^{(n)}(w)}&\text{otherwise}\end{cases}
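A minimal sketch of this update in code, vectorized over states; the function name is illustrative, and the final projection back onto i ≥ 0 is an addition of this sketch that follows the box-constraint treatment in Section 4.

```python
import numpy as np

def update_investment(i_n, dQ_di, lam):
    """One VF-PGI gradient step for investments under the nonnegativity
    constraint, following the piecewise rule above.
    i_n, dQ_di: arrays over states w; lam: tuning parameter lambda."""
    stay = (i_n == 0.0) & (dQ_di < 0.0)        # at the boundary with an outward-pointing gradient
    i_next = np.where(stay, i_n, i_n + lam * dQ_di)
    return np.maximum(i_next, 0.0)             # projection onto i >= 0 (added in this sketch)
```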

Results

First, Table 4 shows the findings in the setting without quadratic investment costs (θ_2=0). In this specific setting, as discussed in Pakes et al., (1993), we can show that the optimal i_j given value functions V and competitors' actions i_{-j}^* satisfies i_j^*(w)=\max\left\{\frac{-1+\sqrt{\beta\gamma\left(v_{1j}(w)-v_{2j}(w)\right)}}{\gamma},0\right\}\ (j=1,\cdots,J;w\in\Omega), where v_{1j}(w)\equiv E_{w^{\prime}}\left[V_{j}(w^{\prime})|w,\tau_{j}=1,i_{-j}^{*}(w)\right] and v_{2j}(w)\equiv E_{w^{\prime}}\left[V_{j}(w^{\prime})|w,\tau_{j}=0,i_{-j}^{*}(w)\right]. This implies that nonlinear optimization problems inside the iterations can be avoided by using the analytical solution, which leads to a lower computational burden per iteration in the VFI* algorithm. As shown in the table, VF-PGI-Spectral and VFI*-Spectral perform comparably.
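A one-line sketch of this closed-form solution (names are illustrative; v1 and v2 correspond to v_{1j}(w) and v_{2j}(w) above):

```python
import numpy as np

def optimal_investment_closed_form(v1, v2, beta, gamma):
    """i*(w) = max{(-1 + sqrt(beta * gamma * (v1 - v2))) / gamma, 0}, the
    closed-form investment when theta_2 = 0 (Pakes et al., 1993)."""
    inside = np.maximum(beta * gamma * (v1 - v2), 0.0)   # guard against a negative argument
    return np.maximum((-1.0 + np.sqrt(inside)) / gamma, 0.0)
```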

In contrast, in the general setting where quadratic adjustment costs exist (θ2>0)(\theta_{2}>0), there are no analytical formulas related to optimal investments. Table 5 shows the results under θ2=1\theta_{2}=1, and we can see that VF-PGI-Spectral outperforms VFI*-Spectral.

Table 4: Speed and accuracy of algorithms (Dynamic investment competition model with discrete states; θ2=0\theta_{2}=0; Without quadratic investment cost)
JJ Method CPU(secs) Iter. L(V)L_{\infty}(V) L(i)L_{\infty}(i)
1 VF-PGI-Spectral 0.28 74 -6.09 -6.41
VFI*-Spectral (analytical sol) 0.28 95 -6.05 -6.26
VFI* (analytical sol) 0.42 219 -6.02 -7.53
2 VF-PGI-Spectral 3.41 103 -6.11 -6.6
VFI*-Spectral (analytical sol) 3.06 99 -6.03 -6.93
VFI* (analytical sol) 5.74 196 -6.02 -7.36
3 VF-PGI-Spectral 74.11 129 -6.02 -6.25
VFI*-Spectral (analytical sol) 59.63 102 -6.02 -6.73
VFI* (analytical sol) 124.67 206 -6.02 -7.59

Notes. L(V)L_{\infty}(V) denotes the value of log10(ΦV(V,a)V)\log_{10}\left(\left\|\Phi_{V}(V,a)-V\right\|_{\infty}\right). L(i)L_{\infty}(i) is defined analogously.

Table 5: Speed and accuracy of algorithms (Dynamic investment competition model with discrete states; θ2=1\theta_{2}=1; With quadratic investment cost)
JJ Method CPU(secs) Iter. L(V)L_{\infty}(V) L(i)L_{\infty}(i)
1 VF-PGI-Spectral 0.25 66 -6.94 -6.26
VFI*-Spectral 0.46 53 -7.26 -7.03
2 VF-PGI-Spectral 3.76 111 -6.17 -6.88
VFI*-Spectral 12.3 98 -6.2 -7.5
3 VF-PGI-Spectral 63.08 97 -6.59 -6.63
VFI*-Spectral 191.63 77 -6.1 -7.01

Notes. L(V)L_{\infty}(V) denotes the value of log10(ΦV(V,a)V)\log_{10}\left(\left\|\Phi_{V}(V,a)-V\right\|_{\infty}\right). L(i)L_{\infty}(i) is defined analogously.

6 Relationships with previously proposed methods

6.1 ECM, EE, and EGM

To ease the computation of single-agent dynamic optimization problems, several methods (ECM, EE, EGM) have been proposed, mainly in the macroeconomics literature. These methods avoid nonlinear optimization problems in the iterations, leading to a lower computational burden. However, they are not directly applicable to multi-agent dynamic games. This section clarifies this point using the dynamic investment competition model with continuous states considered in Section 5.2 as an example. To simplify the equations, we consider the setting c(i)=θ_1 i+θ_2 i². Moreover, we assume that the exogenous states z_t are constant over time, and we do not explicitly treat them as states.

6.1.1 Envelope Condition Method (ECM; Maliar and Maliar,, 2013; Arellano et al.,, 2016)

In the dynamic investment competition model, the envelope condition is:171717See Appendix A.3 for the derivation.

Vj(kj,kj)kj=πj(kj,kj)kj+(1δ)c(kj(1δ)kj)+βkjkjVj(kj,kj)kj\frac{\partial V_{j}\left(k_{j},k_{-j}\right)}{\partial k_{j}}=\frac{\partial\pi_{j}\left(k_{j},k_{-j}\right)}{\partial k_{j}}+(1-\delta)c^{\prime}\left(k_{j}^{\prime}-(1-\delta)k_{j}\right)+\beta\frac{\partial k_{-j}^{\prime}}{\partial k_{j}}\frac{\partial V_{j}\left(k_{j}^{\prime},k_{-j}^{\prime}\right)}{\partial k_{-j}^{\prime}} (15)

If firm jj alone exists, the last term disappears, and we have:

Vj(kj,kj)kj=πj(kj,kj)kj+(1δ)(θ1+2θ2ij)\frac{\partial V_{j}(k_{j},k_{-j})}{\partial k_{j}}=\frac{\partial\pi_{j}(k_{j},k_{-j})}{\partial k_{j}}+(1-\delta)\left(\theta_{1}+2\theta_{2}i_{j}\right)

It implies that we can analytically solve for i_j given the value of V_j(k_j,k_{-j}); a code sketch of this inversion is given after the algorithm below. Hence, we can consider the following algorithm to solve for V when only firm j exists:

  1. 1.

    Take grid points kjk_{j}. Set initial values of Vj(0)(kj)V_{j}^{(0)}(k_{j}). Iterate the following until convergence (n=0,1,)(n=0,1,\cdots):

  2. 2.

    Given Vj(n)V_{j}^{(n)},

    1. (a)

      Analytically solve for ij(n+1)(kj)i_{j}^{(n+1)}(k_{j}) satisfying Vj(n)(kj)kj=πj(kj)kj+(1δ)(θ1+2θ2ij(n+1)(kj))\frac{\partial V_{j}^{(n)}(k_{j})}{\partial k_{j}}=\frac{\partial\pi_{j}(k_{j})}{\partial k_{j}}+(1-\delta)\left(\theta_{1}+2\theta_{2}i_{j}^{(n+1)}\left(k_{j}\right)\right)

    2. (b)

      Update Vj(kj)V_{j}(k_{j}) by : Vj(n+1)(kj)=πj(kj)c(ij(n+1)(kj))+βVj(n)((1δ)kj+ij(n+1)(kj))V_{j}^{(n+1)}(k_{j})=\pi_{j}(k_{j})-c\left(i_{j}^{(n+1)}(k_{j})\right)+\beta V_{j}^{(n)}\left((1-\delta)k_{j}+i_{j}^{(n+1)}(k_{j})\right)
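As noted above, step 2(a) is a simple analytical inversion of the envelope condition. A minimal sketch, assuming the derivatives ∂V/∂k and ∂π/∂k are available from the current approximations (names are illustrative):

```python
def invest_from_envelope(dV_dk, dpi_dk, delta, theta1, theta2):
    """Invert the single-firm envelope condition
    dV/dk = dpi/dk + (1 - delta) * (theta1 + 2 * theta2 * i)
    for investment i (step 2(a) above); a sketch with illustrative names."""
    return ((dV_dk - dpi_dk) / (1.0 - delta) - theta1) / (2.0 * theta2)
```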

In contrast, in the model involving multiple firms, the term βkjkjVj(kj,kj)kj\beta\frac{\partial k_{-j}^{\prime}}{\partial k_{j}}\frac{\partial V_{j}(k_{j}^{\prime},k_{-j}^{\prime})}{\partial k_{-j}^{\prime}} does not disappear, and we need knowledge of the values of the term kjkj\frac{\partial k_{-j}^{\prime}}{\partial k_{j}} to analytically solve for ij(n+1)(kj)i_{j}^{(n+1)}(k_{j}). However, the term cannot be directly recovered from VV, and the idea of ECM is not directly applicable.

6.1.2 Euler Equation (EE) method (cf. Judd,, 1998)

The Euler equation for the model is:181818See Appendix A.3 for the derivation. kj′′k_{j}^{\prime\prime} denotes firm jj’s capital two periods later than the current period.

0=c(kj(1δ)kj)+βπj(kj,kj)kj+β(1δ)c(kj′′(1δ)kj)+β2kj′′kjVj(kj′′,kj′′)kj′′0=-c^{\prime}\left(k_{j}^{\prime}-(1-\delta)k_{j}\right)+\beta\frac{\partial\pi_{j}(k_{j}^{\prime},k_{-j}^{\prime})}{\partial k_{j}^{\prime}}+\beta(1-\delta)c^{\prime}\left(k_{j}^{\prime\prime}-(1-\delta)k_{j}^{\prime}\right)+\beta^{2}\frac{\partial k_{-j}^{\prime\prime}}{\partial k_{j}^{\prime}}\frac{\partial V_{j}(k_{j}^{\prime\prime},k_{-j}^{\prime\prime})}{\partial k_{-j}^{\prime\prime}} (16)

If only one firm exists and c(i)=θ1i+θ2i2c(i)=\theta_{1}i+\theta_{2}i^{2}, the last term disappears, and

0=-\theta_{1}-2\theta_{2}i_{j}(k_{j})+\beta\frac{\partial\pi_{j}\left((1-\delta)k_{j}+i_{j}(k_{j})\right)}{\partial i_{j}}+\beta(1-\delta)\left[\theta_{1}+2\theta_{2}i_{j}^{\prime}\right]

holds.

Hence, we can consider the following algorithm (step 2 is sketched in code after the list):

  1. 1.

    Take grid points kjk_{j}. Set initial values of ij(0)(kj)i_{j}^{(0)}(k_{j}). Iterate the following until convergence (n=0,1,)(n=0,1,\cdots):

  2. 2.

    Given i_j^{(n)}(k_j), analytically solve for i_j^{(n+1)}(k_j) satisfying 0=-\theta_{1}-2\theta_{2}i_{j}^{(n+1)}(k_{j})+\beta\frac{\partial\pi_{j}\left(k_{j}^{\prime}(k_{j})\right)}{\partial k_{j}^{\prime}(k_{j})}+\beta(1-\delta)\left[\theta_{1}+2\theta_{2}\overline{i_{j}^{(n)}}(k_{j}^{\prime}(k_{j}))\right],

    where kj(kj)=(1δ)kj+ij(n)(kj)k_{j}^{\prime}(k_{j})=(1-\delta)k_{j}+i_{j}^{(n)}(k_{j}), and ij(n)¯\overline{i_{j}^{(n)}} denotes the interpolated value of iji_{j} using the values of ij(n)(k)i_{j}^{(n)}(k).
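Step 2 is again an analytical inversion, this time of the Euler equation with c'(i) = θ_1 + 2θ_2 i. A minimal sketch (names are illustrative; dpi_dkprime stands for ∂π_j/∂k_j' evaluated at k_j'(k_j), and i_next for the interpolated value of i_j^{(n)} at k_j'(k_j)):

```python
def ee_update_investment(dpi_dkprime, i_next, beta, delta, theta1, theta2):
    """Solve 0 = -theta1 - 2*theta2*i + beta*dpi_dkprime
                 + beta*(1 - delta)*(theta1 + 2*theta2*i_next)
    for i, the single-firm Euler-equation update; illustrative names."""
    rhs = -theta1 + beta * dpi_dkprime + beta * (1.0 - delta) * (theta1 + 2.0 * theta2 * i_next)
    return rhs / (2.0 * theta2)
```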

However, for dynamic games, the last term of equation (16), \beta^{2}\frac{\partial k_{-j}^{\prime\prime}}{\partial k_{j}^{\prime}}\frac{\partial V_{j}(k_{j}^{\prime\prime},k_{-j}^{\prime\prime})}{\partial k_{-j}^{\prime\prime}}, remains. This term corresponds to the effect of strategic interactions under the Markov perfect equilibrium, and it cannot be directly computed without introducing V_j. This implies that the EE method is not directly applicable to dynamic games.

6.1.3 Endogenous Gridpoint Method (EGM; Carroll,, 2006)

The idea behind EGM is to solve the problem backward rather than forward. If we directly applied the EGM to the dynamic investment competition model, the algorithm would be as follows:

  1. 1.

    Take grid points k^(kj^)j𝒥𝒮E^\widehat{k^{\prime}}\equiv\left(\widehat{k_{j}^{\prime}}\right)_{j\in\mathcal{J}}\in\widehat{\mathcal{S}_{E}}. Set initial values of (Vj(0)(k(k^)))j𝒥,k^𝒮E^\left(V_{j}^{(0)}\left(k\left(\widehat{k^{\prime}}\right)\right)\right)_{j\in\mathcal{J},\widehat{k^{\prime}}\in\widehat{\mathcal{S}_{E}}}. Iterate the following until convergence (n=0,1,,)(n=0,1,\cdots,):

  2. 2.

    Given V(n)V^{(n)},

    1. (a)

      Analytically solve for i_j^{(n+1)}\left(\widehat{k^{\prime}}\right) satisfying -\theta_{1}-2\theta_{2}i_{j}^{(n+1)}\left(\widehat{k^{\prime}}\right)+\beta\frac{\partial V_{j}^{(n)}\left(\widehat{k_{j}^{\prime}},\widehat{k_{-j}^{\prime}}\right)}{\partial\widehat{k_{j}^{\prime}}}=0

    2. (b)

      Compute endogenous grid point kj(n+1)(k^)=kj^ij(n+1)(k^)1δk_{j}^{(n+1)}\left(\widehat{k^{\prime}}\right)=\frac{\widehat{k_{j}^{\prime}}-i_{j}^{(n+1)}\left(\widehat{k^{\prime}}\right)}{1-\delta}

    3. (c)

      Update Vj(k(k^))V_{j}\left(k\left(\widehat{k^{\prime}}\right)\right) by Vj(n+1)(k(k^))=πj(k(k^))c(ij(n+1)(k^))+βVj(n)(k^)V_{j}^{(n+1)}\left(k\left(\widehat{k^{\prime}}\right)\right)=\pi_{j}\left(k\left(\widehat{k^{\prime}}\right)\right)-c\left(i_{j}^{(n+1)}\left(\widehat{k^{\prime}}\right)\right)+\beta V_{j}^{(n)}\left(\widehat{k^{\prime}}\right)

Here, suppose that two equilibria i^{(1)}(k^*)≡(i_j^{(1)}(k^*))_{j∈𝒥} and i^{(2)}(k^*)≡(i_j^{(2)}(k^*))_{j∈𝒥} exist at a state k^*. Let k_j'^{(m)}=(1-δ)k_j^*+i_j^{(m)}(k^*) (m=1,2) be the associated state of firm j in the next period. We consider the case where we choose k'^{(1)}≡(k_j'^{(1)})_{j∈𝒥} and k'^{(2)}≡(k_j'^{(2)})_{j∈𝒥} as grid points. Then, if we use the first endogenous grid point, we evaluate V_j(k^*) as π_j(k^*)-c(i_j^{(1)}(k^*))+βV_j(k'^{(1)}). If we use the second endogenous grid point, we evaluate V_j(k^*) as π_j(k^*)-c(i_j^{(2)}(k^*))+βV_j(k'^{(2)}). If we use both, we obtain two distinct values of V_j(k^*), which are hard to handle in the algorithm above.

6.2 Approach using cubic splines

Goettler and Gordon, (2011) and others191919Goettler and Gordon, (2011) solved a dynamic oligopoly model of durable goods firms introducing dynamic demand structure and firms' continuous investment decisions. use cubic splines to approximate the shape of the action value functions Q with respect to a. Because cubic splines are piecewise cubic functions, their derivatives are quadratic, so the counterpart of ∂Q/∂a=0 is a quadratic equation with an analytical solution. Hence, analysts need not solve nonlinear equations numerically. However, cubic splines require many grid points at which Q is evaluated to attain a good approximation, thereby requiring many function evaluations.
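For instance, on each spline segment Q(a) ≈ a_3 a³ + a_2 a² + a_1 a + a_0, the candidate optima solve the quadratic equation Q'(a)=0; a minimal sketch with illustrative coefficient names:

```python
import numpy as np

def critical_points_of_spline_segment(a3, a2, a1):
    """Real roots of Q'(a) = 3*a3*a**2 + 2*a2*a + a1 = 0, where
    Q(a) = a3*a**3 + a2*a**2 + a1*a + a0 on one spline segment
    (assumes a3 != 0); a sketch with illustrative names."""
    disc = (2.0 * a2) ** 2 - 4.0 * (3.0 * a3) * a1
    if disc < 0.0:
        return []                        # no interior critical point on this segment
    root = np.sqrt(disc)
    return [(-2.0 * a2 + s * root) / (6.0 * a3) for s in (1.0, -1.0)]
```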

6.3 Projection method and deep learning-based methods

The projection method (e.g., Judd,, 1992) parameterizes variables, such as value functions and actions, as functions of states, and solves for the parameters that satisfy the nonlinear equations or minimize their residuals. The method has recently been extended by introducing deep learning techniques (e.g., Azinovic et al.,, 2022). As discussed in Section 3.3, the VF-PGI method treats the Bellman equations and the optimality conditions concerning actions as the nonlinear equations to be solved, so the proposed method resembles projection-based methods. Moreover, actions are updated in the direction of the gradient in the proposed algorithm, and deep learning techniques also rely on gradient-descent-type algorithms. Note that our proposed algorithm might be more intuitive than projection-based algorithms, because we can provide an intuitive explanation of the iteration process until convergence, as discussed in Section 3.

6.4 Reinforcement learning

The concept of using the gradient of “long-term reward” concerning actions for updating actions is similar to the Deterministic Policy Gradient method (DPG; Silver et al.,, 2014), which has been widely applied in the literature of reinforcement learning.202020The DPG method utilizing deep learning techniques is known as DDPG (Lillicrap et al.,, 2015). DDPG was extended to the multi-agent deterministic policy gradient method (MADDPG) by Lowe et al., (2017). However, the settings and ideas are slightly different, and I clarify the point here by considering a single-agent dynamic model.

Unlike the dynamic models examined in the current study, DPG was developed for model-free settings, where agents do not know the parametric forms of per-period profits and state transitions. DPG approximates the action value function Q(a,s) by \widetilde{Q}(a,s;\theta_{Q}) and the action a(s) by \widetilde{a}(s;\theta_{a}). The method iterates joint updates of the parameters θ_Q and θ_a until convergence. The parameter θ_a is updated by \theta_{a}^{(n+1)}=\theta_{a}^{(n)}+\lambda_{a}\nabla_{\theta_{a}}\widetilde{a}\left(s;\theta_{a}\right)\nabla_{a}\widetilde{Q}\left(s,a\right)|_{a=\widetilde{a}\left(s;\theta_{a}\right)}, where λ_a denotes the learning rate.212121See Silver et al., (2014) for details. The updating equation implies that θ_a is updated so that it steps in the direction of the gradient of the action value function Q, an idea analogous to the VF-PGI algorithm. Note that VF-PGI is more intuitive in that it directly updates actions rather than parameters related to the actions.
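To make the comparison concrete, the following is a toy sketch of the DPG-style actor update for a policy that is linear in parameters, a(s) = θ_a·φ(s), so that ∇_{θ_a}a(s) = φ(s); all names are illustrative, and this is not the DDPG implementation of Lillicrap et al., (2015).

```python
def dpg_actor_update(theta_a, phi_s, dQ_da, lr_a):
    """theta_a <- theta_a + lr_a * grad_theta a(s) * dQ/da for a scalar action
    a(s) = theta_a . phi(s), so that grad_theta a(s) = phi(s); a toy sketch.
    theta_a, phi_s: parameter and feature vectors; dQ_da: an estimate of dQ/da."""
    return theta_a + lr_a * phi_s * dQ_da
```

With NumPy arrays for theta_a and phi_s, this matches the updating equation above for the scalar-action case.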

7 Conclusions

This study proposes the Value Function-Policy Gradient Iteration-Spectral (VF-PGI-Spectral) algorithm for computationally efficiently solving dynamic models with continuous actions. The VF-PGI-Spectral algorithm is general, in that it is not limited to models with specific functional forms, and it is applicable to single-agent/multi-agent dynamic models with single/multiple continuous actions. The algorithm is intuitive, easy to code, and substantially faster than the traditional value function iteration algorithm in some settings.

Although the VF-PGI-Spectral algorithm performs numerically well in this study, more theoretical investigation of the convergence properties of the algorithm would be interesting.

Furthermore, although the current study has employed the spectral algorithm to accelerate the convergence and it works well, it is unclear whether the spectral algorithm is the best in terms of convergence speed. Previous studies in operation research have investigated other methods to accelerate value function iterations and policy iterations, such as Anderson acceleration and the idea of momentum.222222For recent studies, see the literature review in Grand-Clément, (2021). Though they have implicitly considered dynamic optimization problems with discrete actions, their ideas would also provide insights into the dynamic models with continuous actions. I leave the investigation of these alternative acceleration methods for further research.

Appendix A Proof

A.1 Proof of Proposition 1

Proof.

(Proof of \Rightarrow)

  • If fi(x)0f_{i}(x^{*})\leq 0 and xi=lix_{i}^{*}=l_{i}, Φ^i(x)xi=lili=0\widehat{\Phi}_{i}(x^{*})-x_{i}^{*}=l_{i}-l_{i}=0.

  • If fi(x)0f_{i}(x^{*})\geq 0 and xi=uix_{i}^{*}=u_{i}, Φ^i(x)xi=uiui=0\widehat{\Phi}_{i}(x^{*})-x_{i}^{*}=u_{i}-u_{i}=0.

  • If x_i^*∈(l_i,u_i), f_i(x^*)=0 implies x_i^*=Φ_i(x^*). Hence, Φ̂_i(x^*)-x_i^*=P_{B_i}(Φ_i(x^*))-x_i^*=P_{B_i}(x_i^*)-x_i^*=0.

Note that neither (f_i(x^*)>0 & x_i^*=l_i) nor (f_i(x^*)<0 & x_i^*=u_i) holds under (10).

Therefore, \Rightarrow holds.

(Proof of \Leftarrow)

  • Case of xi=lix_{i}^{*}=l_{i}:

    Suppose f_i(x^*)>0. Then, by the definition of Φ̂, l_i=x_i^*=P_{B_i}(Φ_i(x^*)) holds. Since l_i∈B_i, l_i=P_{B_i}(Φ_i(x^*))=Φ_i(x^*) holds. It implies x_i^*=Φ_i(x^*), and f_i(x^*)=0 holds by (11). This is a contradiction. Hence, f_i(x^*)≤0 holds.

  • Case of x_i^*=u_i: We can show f_i(x^*)≥0 as in the case of x_i^*=l_i.

  • Case of xi(li,ui)x_{i}^{*}\in(l_{i},u_{i}): Under xi(li,ui)x_{i}^{*}\in(l_{i},u_{i}), xi=Φ^i(x)=PBi(Φi(x))=Φi(x)x_{i}^{*}=\widehat{\Phi}_{i}(x^{*})=P_{B_{i}}\left(\Phi_{i}(x^{*})\right)=\Phi_{i}(x^{*}) holds. Hence, by (11), fi(x)=0f_{i}(x^{*})=0 holds.

A.2 Proof of Proposition 2

Proof.

It is sufficient to show the following.

  • limxili+0,fi(x)0Φ^i(x)=li\lim_{x_{i}\rightarrow l_{i}+0,f_{i}(x)\leq 0}\widehat{\Phi}_{i}(x)=l_{i}:

    limxili+0,fi(x)0Φ^i(x)\displaystyle\lim_{x_{i}\rightarrow l_{i}+0,f_{i}(x)\leq 0}\widehat{\Phi}_{i}(x) =\displaystyle= limxili+0,fi(x)0PBi(Φi(x))\displaystyle\lim_{x_{i}\rightarrow l_{i}+0,f_{i}(x)\leq 0}P_{B_{i}}\left(\Phi_{i}(x)\right)
    =\displaystyle= PBi(limxili+0,fi(x)0Φi(x))\displaystyle P_{B_{i}}\left(\lim_{x_{i}\rightarrow l_{i}+0,f_{i}(x)\leq 0}\Phi_{i}(x)\right)
    =\displaystyle= PBi(li)=li\displaystyle P_{B_{i}}(l_{i})=l_{i}
  • limxixis.t.fi(li,xi)>0,fi(li,xi)=0Φ^i(li,xi)=li\lim_{x_{-i}\rightarrow x_{-i}^{*}\ s.t.\ f_{i}(l_{i},x_{-i})>0,\ f_{i}(l_{i},x_{-i}^{*})=0}\widehat{\Phi}_{i}(l_{i},x_{-i})=l_{i}:

    Let xix_{-i}^{*} be such that fi(li,xi)=0f_{i}(l_{i},x_{-i}^{*})=0. Because Φi(li,xi)\Phi_{i}(l_{i},x_{-i}) is continuous with respect to xix_{-i}, for any ϵ>0\epsilon>0, there exists δ>0\delta>0 such that |xixi|<δ|Φi(li,xi)Φi(li,xi)|<ϵ|x_{-i}-x_{-i}^{*}|<\delta\Rightarrow\left|\Phi_{i}(l_{i},x_{-i})-\Phi_{i}(l_{i},x_{-i}^{*})\right|<\epsilon. Then, for xix_{-i} such that fi(li,xi)>0f_{i}(l_{i},x_{-i})>0,

    |Φi^(li,xi)li|\displaystyle\left|\widehat{\Phi_{i}}(l_{i},x_{-i})-l_{i}\right| =\displaystyle= |PBi(Φi(li,xi))li|\displaystyle\left|P_{B_{i}}\left(\Phi_{i}(l_{i},x_{-i})\right)-l_{i}\right|
    \displaystyle\leq |Φi(li,xi)li|(PBi(Φi(li,xi))li)\displaystyle\left|\Phi_{i}(l_{i},x_{-i})-l_{i}\right|\ \left(\because P_{B_{i}}\left(\Phi_{i}(l_{i},x_{-i})\right)\geq l_{i}\right)
    =\displaystyle= |Φi(li,xi)Φi(li,xi)|\displaystyle\left|\Phi_{i}(l_{i},x_{-i})-\Phi_{i}(l_{i},x_{-i}^{*})\right|
    \displaystyle\leq \displaystyle\epsilon\ \text{for}\ |x_{-i}-x_{-i}^{*}|<\delta
  • limxixis.t.fi(ui,xi)<0,fi(ui,xi)=0Φ^i(ui,xi)=ui\lim_{x_{-i}\rightarrow x_{-i}^{*}\ s.t.\ f_{i}(u_{i},x_{-i})<0,\ f_{i}(u_{i},x_{-i}^{*})=0}\widehat{\Phi}_{i}(u_{i},x_{-i})=u_{i}: Similar to the case above.

A.3 Derivation of the envelope condition and Euler equation in the dynamic investment competition model with continuous states

In this section, we derive equations (15) and (16).

Let kj(k)k_{j}^{\prime}(k) be the optimal next period capital stock of firm jj given the current state kk. By differentiating Vj(kj,kj)=πj(kj,kj)c(kj(k)(1δ)kj)+βVj(kj(k),kj(k))V_{j}(k_{j},k_{-j})=\pi_{j}(k_{j},k_{-j})-c\left(k_{j}^{\prime}(k)-(1-\delta)k_{j}\right)+\beta V_{j}\left(k_{j}^{\prime}(k),k_{-j}^{\prime}(k)\right) with respect to kjk_{j}, we have:

Vj(kj,kj)kj\displaystyle\frac{\partial V_{j}(k_{j},k_{-j})}{\partial k_{j}} =\displaystyle= πj(kj,kj)kj+(1δ)c(kj(1δ)kj)+\displaystyle\frac{\partial\pi_{j}(k_{j},k_{-j})}{\partial k_{j}}+(1-\delta)c^{\prime}\left(k_{j}^{\prime}-(1-\delta)k_{j}\right)+ (17)
kjkj(c(kj(1δ)kj)+βVj(kj,kj)kj)+βkjkjV(kj,kj)kj\displaystyle\frac{\partial k_{j}^{\prime}}{\partial k_{j}}\left(-c^{\prime}\left(k_{j}^{\prime}-(1-\delta)k_{j}\right)+\beta\frac{\partial V_{j}(k_{j}^{\prime},k_{-j}^{\prime})}{\partial k_{j}^{\prime}}\right)+\beta\frac{\partial k_{-j}^{\prime}}{\partial k_{j}}\frac{\partial V(k_{j}^{\prime},k_{-j}^{\prime})}{\partial k_{-j}^{\prime}}

By firm jj’s optimality condition regarding the choice of kjk_{j}^{\prime},

c(kj(1δ)kj)+βV(kj,kj)kj=0-c^{\prime}\left(k_{j}^{\prime}-(1-\delta)k_{j}\right)+\beta\frac{\partial V(k_{j}^{\prime},k_{-j}^{\prime})}{\partial k_{j}^{\prime}}=0 (18)

holds, and equations (17) and (18) imply

Vj(kj,kj)kj=πj(kj,kj)kj+(1δ)c(kj(1δ)kj)+βkjkjVj(kj,kj)kj,\frac{\partial V_{j}\left(k_{j},k_{-j}\right)}{\partial k_{j}}=\frac{\partial\pi_{j}\left(k_{j},k_{-j}\right)}{\partial k_{j}}+(1-\delta)c^{\prime}\left(k_{j}^{\prime}-(1-\delta)k_{j}\right)+\beta\frac{\partial k_{-j}^{\prime}}{\partial k_{j}}\frac{\partial V_{j}\left(k_{j}^{\prime},k_{-j}^{\prime}\right)}{\partial k_{-j}^{\prime}},

namely, equation (15).

Next, equations (15) and (18) imply

0\displaystyle 0 =\displaystyle= c(kj(1δ)kj)+βπj(kj,kj)kj+β(1δ)c(kj′′(1δ)kj)+β2kj′′kjVj(kj′′,kj′′)kj′′,\displaystyle-c^{\prime}\left(k_{j}^{\prime}-(1-\delta)k_{j}\right)+\beta\frac{\partial\pi_{j}(k_{j}^{\prime},k_{-j}^{\prime})}{\partial k_{j}^{\prime}}+\beta(1-\delta)c^{\prime}\left(k_{j}^{\prime\prime}-(1-\delta)k_{j}^{\prime}\right)+\beta^{2}\frac{\partial k_{-j}^{\prime\prime}}{\partial k_{j}^{\prime}}\frac{\partial V_{j}(k_{j}^{\prime\prime},k_{-j}^{\prime\prime})}{\partial k_{-j}^{\prime\prime}},

namely, equation (16).

Appendix B Details of the Numerical experiments

B.1 Details of the numerical experiments in Section 5.1

The code for the experiment is based on the replication code of Maliar and Maliar, (2013). As in Maliar and Maliar, (2013), we use a rectangular, uniformly spaced grid of 10 × 10 points for capital k and productivity z within an ergodic range as the solution domain. Expectations concerning future exogenous variables are approximated using a 3-node Gauss-Hermite quadrature rule. We parameterize value functions with complete ordinary polynomials of degree 4. We assume the iteration has converged when the unit-free norms of the variables are less than 10^{-6}; for instance, we assume VFI has converged when \left\|\frac{V^{(n+1)}(s)}{V^{(n)}(s)}-1\right\|_{\infty}\leq 10^{-6}.
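A minimal sketch of this unit-free convergence check (the function name is illustrative):

```python
import numpy as np

def converged(V_new, V_old, tol=1e-6):
    """Unit-free sup-norm criterion || V_new / V_old - 1 ||_inf <= tol."""
    return np.max(np.abs(V_new / V_old - 1.0)) <= tol
```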

As in Maliar and Maliar, (2013), we use z(1l¯)z(1-\overline{l}) as the initial labor function. Our initial guess on the consumption function is that a constant fraction πc\pi_{c} of the period output goes to consumption and the rest goes to investment. As initial value functions, I use values consistent with the labor and consumption functions.

Regarding the ECM and EGM algorithms, I show their steps in Algorithms 6 and 7.

  1. 1.

    Set grid points (k,z)(k,z) and initial V(k,z)V(k,z).

  2. 2.

    Iterate the following until convergence

    1. (a)

      Solve for ll satisfying Vk(k,z)=B(1l)μa(1α)kαlα[1δ+aαkα1l1α]V_{k}(k,z)=\frac{B(1-l)^{-\mu}}{a(1-\alpha)k^{\alpha}l^{-\alpha}}\left[1-\delta+a\alpha k^{\alpha-1}l^{1-\alpha}\right].

    2. (b)

      Compute c=(B(1l)μ(1α)Azkαlα)1γc=\left(\frac{B(1-l)^{-\mu}}{(1-\alpha)Azk^{\alpha}l^{-\alpha}}\right)^{\frac{-1}{\gamma}} and k=(1δ)k+zr(k,l)ck^{\prime}=(1-\delta)k+zr(k,l)-c

    3. (c)

      Update V(k,z)V(k,z) by V(k,z)u(c,l)+βE[V¯(k,z)]V(k,z)\leftarrow u(c,l)+\beta E\left[\overline{V}(k^{\prime},z^{\prime})\right]

Algorithm 6 ECM algorithm for the single-agent growth model with elastic labor supply
  1. 1.

    Set grid points (k,z)(k^{\prime},z) and initial V(k(k),z)V\left(k(k^{\prime}),z\right).

  2. 2.

    Iterate the following until convergence. Here, let W(k,z)Ez[V(k,z)|z]W\left(k^{\prime},z\right)\equiv E_{z^{\prime}}\left[V(k^{\prime},z^{\prime})|z\right].

    1. (a)

      Compute c=[βWk(k,z)]1/γc=\left[\beta W_{k^{\prime}}\left(k^{\prime},z\right)\right]^{-1/\gamma}

    2. (b)

      Solve for ll satisfying k=(1δ)(B(1l)μβWk(k,z)z(1α))1/αl+B(1l)μlβWk(k,z)(1α)ck^{\prime}=(1-\delta)\left(\frac{B(1-l)^{-\mu}}{\beta W_{k}(k^{\prime},z)z(1-\alpha)}\right)^{1/\alpha}l+\frac{B(1-l)^{-\mu}l}{\beta W_{k}(k^{\prime},z)(1-\alpha)}-c.

    3. (c)

      Compute c=(B(1l)μ(1α)Azkαlα)1γc=\left(\frac{B(1-l)^{-\mu}}{(1-\alpha)Azk^{\alpha}l^{-\alpha}}\right)^{\frac{-1}{\gamma}} and k=k+czr(k,l)1δk=\frac{k^{\prime}+c-zr(k,l)}{1-\delta}

    4. (d)

      Update V(k(k),z)V\left(k(k^{\prime}),z\right) by V(k(k),z)u(c,l)+βW(k,z)V\left(k(k^{\prime}),z\right)\leftarrow u(c,l)+\beta W(k^{\prime},z)

Algorithm 7 EGM algorithm for the single-agent growth model with elastic labor supply

B.2 Details of the numerical experiments in Section 5.2

I compute the expectations concerning future exogenous variables by using a 10-node Gauss Hermite quadrature rule. To ease the computation of the expectation, I apply the “precomputation of integrals” technique in Judd et al., (2017). The technique reduces the computational cost per iteration.

Regarding basis functions, I use Chebyshev polynomials. As the solution domain, I choose a rectangular grid. The range of each firm's capital stock k_{jt} is [1/1.5, 1×1.5]. The range of the exogenous demand shock ξ is [4/1.03, 4×1.03], and the range of the exogenous cost shock μ is [2/1.03, 2×1.03]. To reduce the number of grid points without largely losing numerical accuracy, I apply the Smolyak method (Smolyak,, 1963; Judd et al.,, 2014). The approximation level, denoted as μ in Judd et al., (2014), is set to 3. Note that the relative performance of the algorithms rarely changes even when not introducing such additional techniques.

As initial value functions, we use values assuming that exogenous states are constant over time and firms make constant investments so that the current capital stocks remain constant over time. As initial investment values, we use 0. The tolerance level for evaluating convergence is set to 1E-10.

When applying the VFI* algorithm, we need to solve nonlinear optimization problems for each firm and state. To solve the problem, I apply Newton’s method.

B.3 Details of the numerical experiments in Section 5.3

The code for the experiment is based on the replication code of Pakes and McGuire, (1994). As in Pakes and McGuire, (1994), we use 0 as the initial value functions and investments. To make the comparison easier, I modified the original code so that the steps are consistent with the VF-PGI* algorithm shown in Algorithm 4. Note that the modification rarely affected the performance of the algorithm. The tolerance level for evaluating convergence is set to 1E-10.

In addition, because some global variables made the iterations more computationally costly, I removed these global variables and passed them as function arguments instead. When solving nonlinear optimization problems in the absence of analytical formulas for the solutions, I use the fminbnd function in MATLAB, setting the lower and upper bounds to 0 and 1.

Appendix C Additional results

C.1 Single-agent neoclassical growth model with elastic labor supply

C.1.1 Effect of the value of λ\lambda on the performance of the VF-PGI algorithm

Table 6 shows the numerical results when we do not apply the spectral algorithm to VF-PGI. The results imply that the iteration does not diverge when we set λ≤10^{-8}. Nevertheless, convergence is very slow, and the iterations terminated before reaching the convergence criterion. In contrast, when λ≥10^{-7}, the iterations diverge. Hence, though VF-PGI-Spectral works well, VF-PGI itself is not practical.

Table 6: Effect of the value of λ\lambda in the VF-PGI algorithm (Single agent neoclassical growth model with elastic labor supply)
Method log10(λ)\log_{10}(\lambda) CPU (sec) L1L_{1} LL_{\infty} Iter. Conv.
VF-PGI -10 3.495 -3.266 -2.394 3000 0
VF-PGI -9 3.925 -3.852 -3.256 3000 0
VF-PGI -8 5.322 -5.425 -3.983 3000 0
VF-PGI -7 0.046 NaN NaN 8 0
VF-PGI -6 0.015 NaN NaN 4 0
VF-PGI -5 0.021 NaN NaN 6 0

Notes. α0=1\alpha_{0}=1.

L_{1} and L_{\infty} denote the average and the maximum absolute values of Euler residuals in log10 units on a stochastic simulation of 10,000 observations.

Table 7 shows the results when we set λ=1. The results show that when we set α_0 to a sufficiently small value, the iterations converge, and the value of α_0 does not largely affect the convergence speed as long as the iterations converge. For relatively large values of α_0, the iterations diverge.

Table 7: Effect of the value of α0\alpha_{0} in the VF-PGI-Spectral algorithm (Single agent neoclassical growth model with elastic labor supply)
Method log10(α0)\log_{10}(\alpha_{0}) CPU (sec) L1L_{1} LL_{\infty} Iter. Conv.
VF-PGI-Spectral -10 0.169 -5.425 -3.983 136 1
VF-PGI-Spectral -9 0.129 -5.425 -3.983 115 1
VF-PGI-Spectral -8 0.12 -5.425 -3.983 95 1
VF-PGI-Spectral -7 0.137 -5.425 -3.983 112 1
VF-PGI-Spectral -6 0.022 NaN NaN 14 0
VF-PGI-Spectral -5 0.008 NaN NaN 4 0

Notes. λ=1\lambda=1.

L_{1} and L_{\infty} denote the average and the maximum absolute values of Euler residuals in log10 units on a stochastic simulation of 10,000 observations.

C.1.2 Alternative specification: Same αz(z{V,ad=1,,D})\alpha_{z}\ \left(z\in\left\{V,a^{d=1,\cdots,D}\right\}\right)

Table 8 shows the results under the alternative setting where we assume that αz(z{V,ad=1,,D})\alpha_{z}\ \left(z\in\left\{V,a^{d=1,\cdots,D}\right\}\right) are common. The results show that the iteration becomes unstable when choosing a common value.

Table 8: Effect of the choice of αz(z{V,ad=1,,D})\alpha_{z}\ \left(z\in\left\{V,a^{d=1,\cdots,D}\right\}\right) in the VF-PGI-Spectral algorithm (Single agent neoclassical growth model with elastic labor supply)
Method Same αz\alpha_{z} CPU (sec) L1L_{1} LL_{\infty} Iter. Conv.
VF-PGI-Spectral No 0.156 -5.425 -3.983 138 1
VF-PGI-Spectral Yes 0.033 NaN NaN 19 0

Notes. α0=1,λ=107\alpha_{0}=1,\lambda=10^{-7}.

L_{1} and L_{\infty} denote the average and the maximum absolute values of Euler residuals in log10 units on a stochastic simulation of 10,000 observations.

C.2 Single-agent neoclassical growth model with inelastic labor supply

Settings

As in Arellano et al., (2016) and Coleman et al., (2021), we consider a standard single-agent neoclassical growth model with inelastic labor supply. The dynamic optimization problem is characterized by:

V(k,z)\displaystyle V(k,z) =\displaystyle= maxc,ku(c)+βE[V(k,z)]\displaystyle\max_{c,k^{\prime}}u(c)+\beta E\left[V(k^{\prime},z^{\prime})\right]
s.t.\displaystyle s.t. k=(1δ)k+zr(k)c\displaystyle k^{\prime}=(1-\delta)k+zr(k)-c
lnz=ρlnz+ϵ,ϵN(0,σ2)\displaystyle\ln z^{\prime}=\rho\ln z+\epsilon^{\prime},\ \epsilon^{\prime}\sim N(0,\sigma^{2})
u(c)=c1γ11γ\displaystyle u(c)=\frac{c^{1-\gamma}-1}{1-\gamma}
r(k)=Akα\displaystyle r(k)=Ak^{\alpha}

where k, c, and z denote capital, consumption, and the productivity level. We choose the parameters A=\frac{1/\beta-(1-\delta)}{\alpha}, \alpha=1/3, \beta=0.99, \delta=0.025, \rho=0.95, \sigma=0.01. The experiments are run by modifying the replication code of Coleman et al., (2021).232323Coleman et al., (2021)'s code is based on the one in Arellano et al., (2016).
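For reference, the Euler residuals reported in the accuracy columns can be computed from the model's first-order and envelope conditions, u'(c) = βE[u'(c')(1-δ+z'Aαk'^{α-1})]. The sketch below assumes the expectation over z' is approximated with quadrature nodes and weights; all names are illustrative.

```python
import numpy as np

def euler_residual(c, k_next, c_next, z_next, weights, beta, delta, gamma, alpha, A):
    """Unit-free Euler-equation residual
    beta * E[u'(c') * (1 - delta + z' * A * alpha * k'**(alpha - 1))] / u'(c) - 1
    for the inelastic-labor growth model; zero at the exact solution.
    c, k_next: scalars; c_next, z_next, weights: arrays over quadrature nodes."""
    mu_c = c ** (-gamma)                                # u'(c) for CRRA utility
    mu_c_next = c_next ** (-gamma)
    rhs = beta * np.sum(weights * mu_c_next
                        * (1.0 - delta + z_next * A * alpha * k_next ** (alpha - 1.0)))
    return rhs / mu_c - 1.0
```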

Results

Table 9 shows the results. For details of the 7 methods without the spectral algorithm, see Arellano et al., (2016). The results show that VF-PGI-Spectral works well, but ECM-VF-Spectral, ECM-DVF-Spectral, and EE-Spectral work better. In these three methods, we can entirely avoid solving nonlinear problems inside the iterations, and only one type of variable (value function or action) has to be updated. In contrast, in VF-PGI-Spectral, we have to update two types of variables (value functions and actions) separately, which might lead to slower convergence. Hence, when researchers can find good analytical representations that avoid solving nonlinear problems inside the iterations, these methods can be good alternatives to VF-PGI-Spectral.

Table 9: Speed and accuracy of algorithms (Single agent neoclassical growth model with inelastic labor supply)
Method CPU (sec) L1L_{1} LL_{\infty} Iter.
VF-PGI-Spectral 0.121 -6.089 -4.099 92
ECM-VF-Spectral 0.060 -6.064 -4.056 59
VFI-Spectral 4.921 -6.089 -4.099 66
EGM-Spectral 2.151 -6.091 -4.124 40
ECM-PI-Spectral 0.754 -5.465 -3.944 5
PI-Spectral 1.202 -5.396 -3.689 5
ECM-DVF-Spectral 0.046 -6.071 -4.137 31
EE-Spectral 0.076 -7.79 -5.936 48
ECM-VF 0.831 -6.064 -4.056 1072
VFI 106.812 -6.089 -4.099 1072
EGM 57.741 -6.091 -4.124 768
ECM-PI 0.384 -5.938 -4.044 5
PI 0.898 -5.952 -4.086 5
ECM-DVF 0.385 -6.071 -4.137 292
EE 0.445 -7.784 -5.934 324

Notes. α0=1,λ=0.001\alpha_{0}=1,\lambda=0.001.

L_{1} and L_{\infty} denote the average and the maximum absolute values of Euler residuals in log10 units on a stochastic simulation of 10,000 observations.

The results also show that combining the spectral algorithm does not lead to a large speed-up for PI (policy iteration)-type algorithms. As formally discussed in Puterman and Brumelle, (1979) and Santos and Rust, (2004), policy iteration is mathematically equivalent to applying Newton's method to the nonlinear Bellman equation. Because the role of the spectral algorithm is similar to that of Newton's method, the additional gain from introducing the spectral algorithm might not be large for policy iteration-type algorithms.

References

  • Aguirregabiria and Marcoux, (2021) Aguirregabiria, V. and Marcoux, M. (2021). Imposing equilibrium restrictions in the estimation of dynamic discrete games. Quantitative Economics, 12(4):1223–1271.
  • Arellano et al., (2016) Arellano, C., Maliar, L., Maliar, S., and Tsyrennikov, V. (2016). Envelope condition method with an application to default risk models. Journal of Economic Dynamics and Control, 69:436–459.
  • Azinovic et al., (2022) Azinovic, M., Gaegauf, L., and Scheidegger, S. (2022). Deep equilibrium nets. International Economic Review, 63(4):1471–1525.
  • Barzilai and Borwein, (1988) Barzilai, J. and Borwein, J. M. (1988). Two-point step size gradient methods. IMA journal of numerical analysis, 8(1):141–148.
  • Berry et al., (1995) Berry, S., Levinsohn, J., and Pakes, A. (1995). Automobile prices in market equilibrium. Econometrica, 63(4):841–890.
  • Carroll, (2006) Carroll, C. D. (2006). The method of endogenous gridpoints for solving dynamic stochastic optimization problems. Economics Letters, 91(3):312–320.
  • Coleman et al., (2021) Coleman, C., Lyon, S., Maliar, L., and Maliar, S. (2021). Matlab, python, julia: What to choose in economics? Computational Economics, 58:1263–1288.
  • Conlon and Gortmaker, (2020) Conlon, C. and Gortmaker, J. (2020). Best practices for differentiated products demand estimation with PyBLP. The RAND Journal of Economics, 51(4):1108–1161.
  • Cooper and Haltiwanger, (2006) Cooper, R. W. and Haltiwanger, J. C. (2006). On the nature of capital adjustment costs. The Review of Economic Studies, 73(3):611–633.
  • Fukasawa, (2024) Fukasawa, T. (2024). Fast and simple inner-loop algorithms of static/dynamic BLP estimations. arXiv preprint arXiv:2404.04494.
  • Fukasawa and Ohashi, (2023) Fukasawa, T. and Ohashi, H. (2023). Long-run Effect of a Horizontal Merger and Its Remedial Standards. Technical report, Research Institute of Economy, Trade and Industry (RIETI).
  • Goettler and Gordon, (2011) Goettler, R. L. and Gordon, B. R. (2011). Does AMD spur intel to innovate more? Journal of Political Economy, 119(6):1141–1200.
  • Grand-Clément, (2021) Grand-Clément, J. (2021). From convex optimization to MDPs: A review of first-order, second-order and quasi-Newton methods for MDPs. arXiv preprint arXiv:2104.10677.
  • Iskhakov, (2015) Iskhakov, F. (2015). Multidimensional endogenous gridpoint method: Solving triangular dynamic stochastic optimization problems without root-finding operations. Economics Letters, 135:72–76.
  • Judd, (1992) Judd, K. L. (1992). Projection methods for solving aggregate growth models. Journal of Economic theory, 58(2):410–452.
  • Judd, (1998) Judd, K. L. (1998). Numerical methods in economics. MIT press.
  • Judd et al., (2017) Judd, K. L., Maliar, L., Maliar, S., and Tsener, I. (2017). How to solve dynamic stochastic models computing expectations just once. Quantitative Economics, 8(3):851–893.
  • Judd et al., (2014) Judd, K. L., Maliar, L., Maliar, S., and Valero, R. (2014). Smolyak method for solving dynamic economic models: Lagrange interpolation, anisotropic grid and adaptive domain. Journal of Economic Dynamics and Control, 44:92–123.
  • La Cruz et al., (2006) La Cruz, W., Martínez, J., and Raydan, M. (2006). Spectral residual method without gradient information for solving large-scale nonlinear systems of equations. Mathematics of Computation, 75(255):1429–1448.
  • Lillicrap et al., (2015) Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971.
  • Lowe et al., (2017) Lowe, R., Wu, Y. I., Tamar, A., Harb, J., Pieter Abbeel, O., and Mordatch, I. (2017). Multi-agent actor-critic for mixed cooperative-competitive environments. Advances in neural information processing systems, 30.
  • Maliar and Maliar, (2013) Maliar, L. and Maliar, S. (2013). Envelope condition method versus endogenous grid method for solving dynamic programming problems. Economics Letters, 120(2):262–266.
  • Mermelstein et al., (2020) Mermelstein, B., Nocke, V., Satterthwaite, M. A., and Whinston, M. D. (2020). Internal versus external growth in industries with scale economies: A computational model of optimal merger policy. Journal of Political Economy, 128(1):301–341.
  • Miranda and Fackler, (2004) Miranda, M. J. and Fackler, P. L. (2004). Applied computational economics and finance. MIT press.
  • Pakes et al., (1993) Pakes, A., Gowrisankaran, G., and McGuire, P. (1993). Implementing the Pakes-McGuire algorithm for computing Markov perfect equilibria in Gauss. Technical report, Working paper, Yale University, New Haven.
  • Pakes and McGuire, (1994) Pakes, A. and McGuire, P. (1994). Computing Markov-Perfect Nash Equilibria: Numerical Implications of a Dynamic Differentiated Product Model. The RAND Journal of Economics, 25(4):555–589.
  • Puterman and Brumelle, (1979) Puterman, M. L. and Brumelle, S. L. (1979). On the convergence of policy iteration in stationary dynamic programming. Mathematics of Operations Research, 4(1):60–69.
  • Reynaerts et al., (2012) Reynaerts, J., Varadhan, R., and Nash, J. C. (2012). Enhancing the convergence properties of the BLP (1995) contraction mapping.
  • Santos and Rust, (2004) Santos, M. S. and Rust, J. (2004). Convergence properties of policy iteration. SIAM Journal on Control and Optimization, 42(6):2094–2115.
  • Silver et al., (2014) Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., and Riedmiller, M. (2014). Deterministic policy gradient algorithms. In International conference on machine learning, pages 387–395. PMLR.
  • Smolyak, (1963) Smolyak, S. A. (1963). Quadrature and interpolation formulas for tensor products of certain classes of functions. In Doklady Akademii Nauk, volume 148, pages 1042–1045. Russian Academy of Sciences.
  • Varadhan and Roland, (2008) Varadhan, R. and Roland, C. (2008). Simple and globally convergent methods for accelerating the convergence of any EM algorithm. Scandinavian Journal of Statistics, 35(2):335–353.