
Delegated portfolio management with random default

Alberto Gennaro Department of Industrial Engineering and Operations Research, UC Berkeley, USA.
Email: [email protected]
   Thibaut Mastrolia Department of Industrial Engineering and Operations Research, UC Berkeley, USA.
Email: [email protected]
Abstract

We consider the problem of optimal portfolio delegation between an investor and a portfolio manager under a random default time. We focus on a novel variation of the Principal-Agent problem adapted to this framework. We address the challenge of an uncertain investment horizon caused by an exogenous random default time, after which neither the agent nor the principal can access the market. This uncertainty introduces significant complexity, requiring distinct mathematical approaches for two cases: when the random default time may extend beyond the initial time frame [0,T] and when it falls within this period. We develop a theoretical framework to model the stochastic dynamics of the investment process, incorporating the random default time. We then analyze the portfolio manager's investment decisions and compensation mechanisms for both scenarios. In the first case, where the default time can be unbounded, we apply classical results from Backward Stochastic Differential Equations (BSDEs) and control theory to solve the agent's problem. In the second case, where the default time lies within the interval [0,T], the problem becomes more intricate due to the degeneracy of the BSDE's driver. For both scenarios, we demonstrate that the contracting problem can be resolved by examining the existence of solutions to integro-partial Hamilton-Jacobi-Bellman (HJB) equations. Finally, we develop a deep-learning algorithm to solve the problem in high dimension without access to the optimizer of the Hamiltonian function.

Keywords: Stochastic control with random horizon, Principal-Agent problem, enlargement of filtration, BSDE, HJB equation and deep learning.

1 Introduction

Delegating portfolio management from an investor to a professional fund manager is increasingly seen as a strategic move due to the growing complexity of financial markets and their fragmentation [28, 7, 45]. The financial landscape today is marked by rapid market fluctuations, changing regulations, and an extensive range of investment opportunities, all of which necessitate not only significant time and effort but also in-depth knowledge and experience to navigate effectively. For many investors, handing over portfolio management to a professional allows them to leverage the fund manager’s expertise in areas such as professional oversight, diversification strategies, and sophisticated risk management techniques—capabilities that are often difficult to achieve independently.

In this context, fund managers are expected to deliver superior performance by utilizing their specialized skills and tools. Traditionally, the compensation structure for fund managers has been a blend of a fixed fee and a performance-based component. The fixed fee provides a stable income for the manager, while the performance-based component is designed to incentivize the manager to achieve better returns by aligning their interests with those of the investor. However, this conventional compensation structure raises an important question: Is it optimally designed to align the incentives of both the investor and the fund manager? The challenge lies in ensuring that the performance-based component effectively motivates the fund manager to act in the best interests of the investor, while also accounting for the inherent uncertainties and risks associated with market fluctuations. To address this, it is crucial to evaluate whether the existing compensation structures are adequately aligned with the investor’s objectives and whether alternative models could offer better alignment. This involves exploring various compensation schemes and their impact on fund performance, risk management, and overall investor satisfaction. Ultimately, a well-designed compensation structure should not only incentivize fund managers to maximize returns but also to manage risks prudently, ensuring that both the investor’s and the manager’s goals are harmoniously aligned in the pursuit of financial success.

From a mathematical perspective, this issue can be framed as a variation of the principal-agent (PA) problem in wealth management, see for example [16, 36, 45, 17, 33, 32, 14]. The PA framework in continuous time is a game-theoretical model designed to address problems of stochastic control, where one party (the principal) delegates decision-making authority to another party (the agent), whose actions are not directly observable and evolve in a stochastic environment. The agent controls the system by choosing a strategy that influences outcomes, but because of the information asymmetry, the principal cannot directly observe the agent's actions. Instead, the principal must design a contract based on the observed outcomes, which indirectly depend on the agent's actions. The goal is to structure this contract in a way that motivates the agent to exert optimal effort, while managing the inherent trade-offs between risk-sharing and incentives. Here, the investor (the principal) has an initial capital X_{0}=x and seeks an agent to invest on their behalf. The principal is willing to negotiate a compensation scheme that incentivizes the agent based on portfolio performance and risk. In the literature, the PA problem was extensively studied in a simpler setting: in the seminal work of [22], the agent controls the drift of the process, but the utility is drawn only from the terminal value of the controlled process. In [43] the control still acts only on the drift, but the utility now depends directly on inter-temporal payments. In both cases, because of the modeling choice of a single Brownian motion, there is no moral hazard with respect to the control of the volatility. Moral hazard in volatility control, a concept introduced by [14], becomes important when multiple sources of risk arise in the control problem.
What the authors proved in this work, thanks to mathematical advances on singular changes of measure, is that optimal contracts include incentives with respect to the quadratic variation of the controlled process and its covariation with the risk factors. Furthermore, instead of working from a purely probabilistic perspective, they cast the problem as a stochastic control problem and restrict the analysis to an admissible family of contracts, proving that there is no loss of generality in doing so and thus paving a simpler way for the recent literature on the subject. In a subsequent paper [15], the same result was proven using second-order BSDEs, and a recent work [11] simplified the theoretical guarantees for volatility-controlled problems even further, showing the same results with the standard theory of BSDEs. To summarise, these works define a class of admissible contracts, prove that there is no loss of generality in considering such a form, and find the optimal contract in this set. In practice they rely on the results of [38, 5, 19, 39, 31, 37] on the existence of solutions of BSDEs in the Lipschitz and quadratic cases, with or without jump terms and random horizon. This will also be the building block of the present work.

Portfolio optimization has been extensively studied in the literature, beginning with the pioneering work of [34]. More recent studies have explored continuous-time versions of the framework, as well as settings that incorporate jumps, as in [35]. The PA problem captures the scenario of a fund manager investing on behalf of a client. Despite its practical relevance, the existing literature has not fully addressed a crucial aspect of this problem: the randomness of the investment horizon. More often than not, an investment in financial markets does not have a precise duration, even though such a duration is used as a reference for setting the investor's risk aversion. But time is crucial in control problems. Our contribution aims to fill this gap by examining how decision-making strategies (both the agent's investment strategy and the principal's compensation scheme) are affected by the introduction of default times, which add uncertainty to the investment horizon. Adding a default time makes the problem mathematically more challenging. First of all, using a general default time forces us to delve into the theory of information flows: to treat the problem, we need to enlarge the filtration generated by the financial market with the filtration generated by the random default (see [27, 8, 1, 21] and the references therein). Furthermore, as highlighted by [25], there are two distinct cases to consider for default times, each with different implications:

  • Unbounded Case: If the maximal possible default time S exceeds the investment horizon T (or is infinite, S=+\infty), it is uncertain whether a default will occur within the investment period. In this case, the investment problem reduces to a utility maximization under random horizon. It has been solved, for example, in [29], where it is proved that the solution is related to a system of BSDEs with a jump admitting a solution via a decomposition approach coming from filtration enlargement theory.

  • Bounded Case: If S is less than T, default will certainly occur before the investment horizon ends, but the exact timing is unknown. For simplicity, we assume S=T. It has been solved, for example, in [26, 25], where it is proved that the solution is related to a system of degenerate BSDEs with a jump.

These two cases not only have different interpretations but also require distinct mathematical tools. The unbounded case can be seen as a default caused by a black swan event [40], a crash that forces authorities to close the market (such as the Flash Crash of May 2010, see [30]), a hacker attack on a blockchain or, finally, in a more structured deal with investors, a fund seeing its money withdrawn with little or no notice. This often complicates the investment strategies of funds, and for this reason certain funds (e.g., hedge funds) have very strict policies on funding withdrawal. Mathematically, the BSDE related to this case is better behaved than in the other case. The bounded case can instead represent the well-known and much-studied life insurance market: here, the insurance policy can have a time horizon of over 100 years, so that we can claim that, with probability one, the investor is going to pass away before the natural termination of the contract. This means that, when calibrating the contract, both the agent and the investor are aware that the horizon will not be respected, and this is taken into account in both the trading strategies and the insurance payments. Mathematically, this formulation introduces a difficulty in the family of proposed contracts, as it generates a BSDE with a singular driver (see [25] and references therein), and it also poses extra difficulties for the convergence of numerical methods.

The problem's structure also presents challenges for numerical solutions. The partial differential equation (PDE) resulting from the Hamilton-Jacobi-Bellman (HJB) control problem has a varying coefficient that depends on the solution of a maximization problem involving the solution itself. Addressing this requires a specialized approach using an "actor-critic" iterative algorithm, where the actor solves the PDE for a fixed coefficient and the critic updates the maximization problem based on the actor's latest guess. While several schemes could tackle this iterative process, the most effective have proven to be in the domain of Physics-Informed Neural Networks (PINNs). PINNs are a powerful machine learning framework that blends neural networks with principles from physics to solve complex differential equations, especially in situations where traditional methods may struggle. From a technical perspective, the rise of this methodology was made possible by one of the most useful but perhaps underused techniques in scientific computing: automatic differentiation. We refer to the survey [6] for a comprehensive study of this topic. The simple idea behind it is to differentiate neural networks with respect to both their input coordinates and their model parameters: the former allows us to include derivatives in the loss function, the latter is the standard way to train the network. Introduced and expanded by works such as [42] and [44], PINNs leverage the underlying physical laws, typically encoded as partial differential equations (PDEs), to guide the learning process. Rather than relying purely on data, PINNs incorporate these governing equations into the loss function, ensuring that the neural network solutions respect known physical constraints.
This approach is particularly effective for solving high-dimensional partial (integro-)differential equations arising from (stochastic and continuous-time) problems in virtually every field, such as fluid dynamics, electromagnetism, biology or finance (see the works of [3, 4]). By incorporating physics directly into the architecture, PINNs enable the modeling of complex systems while reducing reliance on large datasets, bridging the gap between traditional numerical solvers and modern machine learning techniques. Despite all these difficulties, the default-time formulation is crucial for practical applications, as it makes both contract incentives and trading strategies more robust. This study contributes to the literature on principal-agent problems, extending its applicability to real-world financial scenarios by providing insights into the effects of random investment horizons and default times.
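The automatic differentiation mentioned above can be illustrated with a minimal forward-mode implementation based on dual numbers. This is a sketch for intuition only (deep-learning frameworks used for PINNs rely on reverse-mode autodiff over computation graphs); all names here are our own.

```python
import math

class Dual:
    """Dual number a + b*eps with eps^2 = 0; the dual part b carries the derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # product rule: (a + a' eps)(b + b' eps) = ab + (a'b + ab') eps
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def exp(x):
    """Chain rule for the exponential: (e^a)' = e^a * a'."""
    return Dual(math.exp(x.val), math.exp(x.val) * x.dot)

def derivative(f, x):
    """Evaluate f'(x) by seeding the dual part with 1."""
    return f(Dual(x, 1.0)).dot

# d/dx exp(2x) = 2 exp(2x); at x = 0 this is 2
print(derivative(lambda x: exp(2 * x), 0.0))  # → 2.0
```

Differentiating a network with respect to its inputs in this way is what lets a PINN loss contain the PDE residual itself.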

The structure of this paper is organized as follows. In Section 2, we present the mathematical formulation of the Principal-Agent problem under time uncertainty, describing the underlying stochastic framework, the controlled wealth process governing the system dynamics, and all the assumptions needed to make the problem tractable. We further define classes of admissible contracts and the optimization problems of both the principal and the agent. In Section 3, we solve the problem sequentially, first focusing on the agent's optimal strategy, which is the same in both the bounded and the unbounded default case. Despite the fundamental difference in the proofs of existence, this trading strategy has the same form in both cases and is plugged into the principal's problem. We then derive the Hamilton-Jacobi-Bellman (HJB) equation for the principal and establish, through a verification theorem, the existence of the solution to the Partial Differential Equation (PDE) that encapsulates the principal's optimization problem in the two proposed settings. Section 3.2 then provides numerical examples that demonstrate the implementation of the theoretical results in concrete scenarios, for both cases, using default times from the families of beta and exponential distributions. The goals are multiple: to show the differences arising within the same case, but also across the two cases, and to draw a comparison with the no-default case. Finally, we also highlight the sub-optimal behaviour of a subset of optimal contracts that mimic real-world compensation schemes.

2 The model and the optimization problem

2.1 Risky assets and portfolio dynamics

We consider a financial market represented by a probability space (\Omega,\mathcal{F},\mathbb{P}) endowed with a d-dimensional Brownian motion W and a finite horizon T>0. We denote by \mathbb{F}:=(\mathcal{F}_{t})_{t\in[0,T]} the natural filtration of this Brownian motion. This market consists of m risky assets with price vector S_{t} at time t and no risk-free rate. The risky assets follow the dynamics:

\frac{dS_{t}^{i}}{S_{t}^{i}}=b^{i}_{t}dt+\sigma^{i}_{t}dW_{t}\qquad\forall i=1:m,

where \sigma^{i} and b^{i} are respectively \mathbb{R}^{1\times d}-valued and \mathbb{R}-valued bounded \mathbb{F}-predictable processes. We define the \mathbb{R}^{m\times d}-valued volatility matrix \sigma whose entry \sigma^{i,j} is the j-th component of \sigma^{i}. We assume that \sigma\sigma^{\top} is an invertible matrix, that is, \sigma\sigma^{\top} is \mathbb{P}\otimes dt-a.e. elliptic. We define \theta_{t}=\sigma_{t}^{\top}(\sigma_{t}\sigma_{t}^{\top})^{-1}b_{t}. Let \pi_{t}=(\pi_{t}^{i})_{1\leq i\leq m} be a vector in \mathbb{R}^{1\times m} representing the fraction of wealth invested in each asset at time t. We refer to it as the investment strategy of the portfolio manager. We set \beta_{t}=\pi_{t}\sigma_{t}. Note that we can refer to either \beta or \pi as the investment strategy interchangeably, up to the volatility factor \sigma. For every \pi, one can define a probability measure \mathbb{P}^{\pi} such that the dynamics of the value of the portfolio starting from X_{0}=x are given by (we refer to Appendix A in [4] for the rigorous formulation of the problem and the choice of the probability \mathbb{P}^{\pi}):

X_{t}:=x+\int_{0}^{t}\pi_{s}\sigma_{s}dW_{s}+\int_{0}^{t}\pi_{s}b_{s}ds,

or equivalently

X_{t}:=x+\int_{0}^{t}\beta_{s}dW_{s}+\int_{0}^{t}\beta_{s}\theta_{s}ds.
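For intuition, the wealth dynamics above can be discretized with an Euler-Maruyama scheme. The sketch below assumes a one-dimensional market (m = d = 1) with constant \beta and \theta; these constants and the function name are illustrative choices, not part of the model specification.

```python
import math
import random

def simulate_wealth(x0, beta, theta, T=1.0, n_steps=1000, seed=0):
    """Euler-Maruyama scheme for dX_t = beta*theta dt + beta dW_t, X_0 = x0."""
    rng = random.Random(seed)
    dt = T / n_steps
    x = x0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x += beta * theta * dt + beta * dW
    return x

# beta = 0 means no investment: the wealth stays at its initial value
print(simulate_wealth(1.0, 0.0, 0.5))  # → 1.0
```

With \theta = 0 the simulated wealth is a discretized martingale, so its Monte Carlo average stays near x across seeds.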

2.2 Default time and enlargement of filtration

The default time is represented by a random variable \tau taking values in \mathbb{R}^{+}. We define the default process by H_{t}:=\mathbf{1}_{\tau\leq t}. Note that this process is not necessarily \mathbb{F}-adapted, since the default time is assumed to be potentially exogenous to the system and hence independent of X. We thus enlarge the available information, and so the filtration \mathbb{F}, by taking into account the information generated by the occurrence of the default time.

Definition 1.

Let \mathcal{H}_{u}:=\sigma(H_{s},s\in[0,u]) be the \sigma-algebra generated by H up to time u\geq 0. Given a filtered space (\Omega,\mathcal{F}_{T},\mathbb{F},\mathbb{P}), the enlarged filtration

\mathbb{G}=(\mathcal{G}_{t})_{t\in[0,T]},\qquad\mathcal{G}_{t}=\bigcap_{\epsilon>0}\{\mathcal{F}_{t+\epsilon}\vee\mathcal{H}_{t+\epsilon}\},

is the smallest enlargement of \mathbb{F} such that \tau is a \mathbb{G}-stopping time.

Remark 1.

The process H is not \mathbb{F}-adapted, but it is a \mathbb{G}-adapted stochastic process.

The goal is to ensure that the enlargement of the filtration to \mathbb{G}, which makes the inaccessible default time \tau a stopping time, also transfers the martingale property from \mathbb{F} to \mathbb{G}; this is known as the immersion property or H-hypothesis. The first fundamental hypothesis is to assume the existence of a (conditional) density for the default time with a certain property.

Hypothesis (Density Hypothesis).

For any t\geq 0, there exists a measurable map \gamma(t,\cdot) such that

\mathbb{P}(\tau\geq x|\mathcal{F}_{t})=\int_{x}^{\infty}\gamma(t,u)du\qquad\forall x\geq 0

and \gamma(t,u)=\gamma(u,u)\mathbf{1}_{t\geq u}.

As a consequence of this assumption (see for example [10, 18]), any \mathbb{F}-martingale is also a \mathbb{G}-martingale. Furthermore, still under the density hypothesis, the process H admits an absolutely continuous compensator, i.e., there exists a non-negative \mathbb{G}-predictable process \lambda^{\mathbb{G}} such that the compensated process M defined by

M_{t}:=H_{t}-\int_{0}^{t}\lambda_{s}^{\mathbb{G}}ds

is a \mathbb{G}-martingale. The compensator vanishes after time \tau (therefore \lambda^{\mathbb{G}}_{t}:=\lambda_{t}\mathbf{1}_{t\leq\tau}) and

\lambda_{t}:=\frac{\gamma(t,t)}{\mathbb{P}(\tau>t|\mathcal{F}_{t})}

is an \mathbb{F}-predictable process. For a complete and deeper discussion of the properties of enlarged filtrations, we refer the reader to [21]. We set \Lambda_{t}:=\int_{0}^{t}\lambda_{s}ds.

As a consequence of Proposition 4.4 in [18],

\mathbb{P}(\tau>t|\mathcal{F}_{t})=e^{-\Lambda_{t}}.

We now turn to the integrability of the process \lambda and the support of the default time \tau. Denoting by \mathcal{T}(\mathbb{A}) the set of \mathbb{A}-stopping times (so that we will have \mathcal{T}(\mathbb{F}) or \mathcal{T}(\mathbb{G})), we consider two cases.

Hypothesis A - unbounded default.

\operatorname*{ess\,sup}_{\rho\in\mathcal{T}(\mathbb{G})}\mathbb{E}\Bigl[\int_{\rho}^{T}\lambda_{s}ds\,\Big|\,\mathcal{G}_{\rho}\Bigr]<+\infty (HA)

As a direct consequence of the tower property and since \mathbb{F}\subseteq\mathbb{G}, Hypothesis A leads to

\operatorname*{ess\,sup}_{\rho\in\mathcal{T}(\mathbb{G})}\mathbb{E}\Bigl[\int_{\rho}^{T}\lambda_{s}ds\,\Big|\,\mathcal{F}_{\rho}\Bigr]<+\infty

Consequently, \mathbb{P}(\tau\in[0,T])<1, that is, the support of \tau strictly contains [0,T].

Hypothesis B - bounded default.

\operatorname*{ess\,sup}_{\rho\in\mathcal{T}(\mathbb{G})}\mathbb{E}\Bigl[\int_{\rho}^{t}\lambda_{s}ds\,\Big|\,\mathcal{G}_{\rho}\Bigr]<+\infty\quad\forall t<T,\qquad\mathbb{E}[\Lambda_{T}]=\infty. (HB)

Hence,

\operatorname*{ess\,sup}_{\rho\in\mathcal{T}(\mathbb{G})}\mathbb{E}\Bigl[\int_{\rho}^{t}\lambda_{s}ds\,\Big|\,\mathcal{F}_{\rho}\Bigr]<+\infty\quad\forall t<T,\qquad\mathbb{E}[\Lambda_{T}]=\infty.

Consequently, \mathbb{P}(\tau\in[0,T])=1, that is, the support of \tau is included in [0,T].
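The two regimes can be made concrete with two illustrative default laws (our own examples, not a calibration from the paper), using \lambda_{t}=\gamma(t,t)/\mathbb{P}(\tau>t|\mathcal{F}_{t}): an exponential default time, whose support exceeds [0,T] and whose intensity is constant (Hypothesis A), and a uniform default time on [0,T], whose intensity 1/(T-t) blows up at T so that \Lambda_{T}=\infty (Hypothesis B).

```python
import math

def exp_intensity(t, mu=0.5):
    """Exponential default (Hypothesis A): density mu*exp(-mu*u),
    survival exp(-mu*t), hence lambda_t = mu for all t."""
    density = mu * math.exp(-mu * t)
    survival = math.exp(-mu * t)
    return density / survival

def unif_intensity(t, T=1.0):
    """Uniform default on [0, T] (Hypothesis B): density 1/T,
    survival (T - t)/T, hence lambda_t = 1/(T - t), diverging at T."""
    return (1.0 / T) / ((T - t) / T)

# Constant intensity: Lambda_T = mu*T < infinity, so P(tau > T) > 0
print(exp_intensity(0.3))
# Diverging intensity: Lambda_t = -log((T - t)/T) -> infinity as t -> T
print(unif_intensity(0.999))
```

The divergence of \Lambda_{t}=-\log((T-t)/T) in the second example is exactly the singularity of the BSDE driver discussed for the bounded case.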

2.3 Admissible strategy and contracts

Admissible strategies \pi may be restricted to a closed or compact subset of \mathbb{R}^{m}. We give the rigorous definition of an admissible strategy below, following [23, Definition 1].

Definition 2 (Admissible strategy with constraints).

Let C be a closed set in \mathbb{R}^{1\times m}. The set of admissible strategies, denoted by \mathcal{A}, consists of the m-dimensional predictable processes \pi such that \mathbb{E}[\int_{0}^{T}|\pi_{t}|^{2}dt]<\infty and \pi_{t}\in C, dt\otimes\mathbb{P}-a.e.

Note that, due to the nature of the problem considered and the presence of a compensation \xi, further integrability conditions are transferred to the admissibility of the contract below. Some examples of admissible strategy sets include:

  • C=[0,1]^{m}: \pi is a proportion of the total wealth X^{\pi} invested in the portfolio, with no possibility to borrow or spend more than the actual value of X^{\pi}; shorting stocks (i.e., selling borrowed stocks) is not permitted.

  • C=[-1,1]^{m}, which permits shorting but does not permit leveraging positions beyond a certain threshold.

  • C=[-M,M]^{m}, assuming that the investor can spend or borrow as much money as needed, limited to a fraction M of the total wealth (possibly greater than 1 or less than -1).

For a symmetric positive definite matrix QQ, we define the norm of a column vector xx by

\|x\|_{Q}=\sqrt{x^{\top}Qx}.

This norm is equivalent to the Euclidean norm in \mathbb{R}^{m}, with equivalence constants determined by the smallest and largest eigenvalues of Q. From this definition of the Q-norm, we define the Q-distance of x to the set C as

\mathrm{dist}_{Q}(x,C):=\inf_{y\in C}\|x-y\|_{Q}.
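For example, with the box constraint C=[-M,M]^{m} and a diagonal Q, the infimum is attained at the coordinatewise projection of x onto C. The sketch below uses that simplifying diagonal assumption (for a general Q the minimizer is not coordinatewise) and takes \|x\|_{Q}=\sqrt{x^{\top}Qx}:

```python
import math

def project_box(x, M):
    """Coordinatewise projection of x onto the box C = [-M, M]^m."""
    return [max(-M, min(M, xi)) for xi in x]

def dist_Q(x, M, q_diag):
    """Q-distance of x to C = [-M, M]^m with Q = diag(q_diag).
    For diagonal Q the infimum is attained at the coordinatewise projection."""
    y = project_box(x, M)
    return math.sqrt(sum(qi * (xi - yi) ** 2
                         for qi, xi, yi in zip(q_diag, x, y)))

# Point (2, 0.5) against C = [-1, 1]^2 with Q = I: only the first
# coordinate is clipped (from 2 to 1), so the distance is 1
print(dist_Q([2.0, 0.5], 1.0, [1.0, 1.0]))  # → 1.0
```

Points inside C have zero distance, since they coincide with their own projection.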

The contract \xi proposed by the investor follows the idea of [14]. We denote by \eta>0 the risk aversion parameter of the portfolio manager, whose CARA exponential utility function is U_{A}(x)=-e^{-\eta x}.

Definition 3 (Admissible contract with contractible variables).

We denote by \Xi the set of admissible contracts \xi given by \mathcal{G}_{\tau\wedge T}-measurable random variables \xi=Y_{T\wedge\tau}^{Y_{0},Z,Z^{X},U,\Gamma^{X},\Gamma}, controlled by \mathbb{G}-predictable real-valued processes U,Z=(Z^{i})_{1\leq i\leq m},Z^{X},\Gamma^{X},\Gamma=(\Gamma^{i})_{1\leq i\leq m} such that \mathcal{I}_{m}-\Gamma^{X}_{t}\sigma_{t}\sigma^{\top}_{t} is a positive definite matrix (\mathcal{I}_{m} denotes the identity matrix in dimension m) and

Y_{t}^{Y_{0},Z,Z^{X},U,\Gamma^{X},\Gamma}=Y_{0}+\int_{0}^{t}\sum_{i=1}^{m}\frac{Z^{i}_{r}}{S^{i}_{r}}dS^{i}_{r}+\int_{0}^{t}Z_{r}^{X}dX_{r}+\int_{0}^{t}U_{r}dH_{r}
+\frac{1}{2}\int_{0}^{t}(\Gamma_{r}^{X}+\eta(Z_{r}^{X})^{2})d\langle X,X\rangle_{r}+\int_{0}^{t}\sum_{i=1}^{m}\frac{\Gamma_{r}^{i}}{S^{i}_{r}}d\langle S^{i},X\rangle_{r}
-\int_{0}^{t}F(r,Z_{r},Z_{r}^{X},\Gamma_{r},\Gamma_{r}^{X},U_{r})dr,

where

F(t,z,z_{x},g,g_{x},u)=\sup_{\nu\in C}f(t,z,z_{x},g,g_{x},u,\nu),

with f:[0,T]\times\Omega\times\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R}\times C\to\mathbb{R} defined by

f(t,z,z_{x},g,g_{x},u,\nu)=zb_{t}+z_{x}\nu\sigma_{t}\theta_{t}+\frac{1}{2}g_{x}\|\nu\sigma_{t}\|^{2}-\frac{1}{2}\|\nu-\alpha_{t}\|^{2}+\sum_{i=1}^{m}g^{i}\nu^{i}\sigma_{t}^{i}(\sigma_{t}^{i})^{\top}-\frac{\lambda_{t}}{\eta}(\exp(-\eta u)-1)-\frac{\eta}{2}\|z\sigma_{t}\|^{2},

so that

F(t,z,z_{x},g,g_{x},u)=\frac{1}{4}q_{t}Q_{t}^{-1}q_{t}+zb_{t}-\frac{1}{2}\|\alpha_{t}\|^{2}-\frac{\lambda_{t}}{\eta}(\exp(-\eta u)-1)-\frac{\eta}{2}\|z\sigma_{t}\|^{2}-\mathrm{dist}^{2}_{Q_{t}}(d_{t},C) (1)

where

Q_{t}=\frac{1}{2}(\mathcal{I}_{m}-g_{x}\sigma_{t}\sigma^{\top}_{t}),\qquad q_{t}=\sigma_{t}\theta_{t}+\sigma_{t}\sigma^{\top}_{t}g+\alpha_{t}-\eta z_{x}\sigma_{t}\sigma^{\top}_{t}z,\qquad d_{t}=\frac{1}{2}q_{t}Q_{t}^{-1},

and there exists \eta^{\prime}>\eta such that

\mathbb{E}\Big[\int_{0}^{T}(\|Z_{s}\|^{2}+\|Z^{X}_{s}\|^{2}+\|\Gamma_{s}\|+\|\Gamma^{X}_{s}\|+\|U_{s}\|^{2}\lambda_{s})ds+\sup_{0\leq t\leq T}e^{\eta^{\prime}|Y_{t}^{Y_{0},Z,Z^{X},U,\Gamma,\Gamma^{X}}|}\Big]<\infty.

The set of processes Z,Z^{X},U,\Gamma,\Gamma^{X} satisfying this integrability condition is denoted by \mathcal{U}, while their restriction to [0,t] is denoted by \mathcal{U}_{t} for any t<T.
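The closed form in (1) can be sanity-checked numerically. In the scalar case (m = 1), f is, up to \nu-free constants, the concave quadratic \nu\mapsto-Q\nu^{2}+q\nu; its supremum over an interval C equals the unconstrained maximum q^{2}/(4Q), attained at d=q/(2Q), minus the Q-weighted squared distance of d to C. This is a verification sketch of the completion-of-the-square step, not code from the paper:

```python
def F_closed_form(Q, q, lo, hi):
    """sup over [lo, hi] of -Q*v**2 + q*v via completion of the square:
    unconstrained maximum q^2/(4Q) at d = q/(2Q), minus Q*(proj(d) - d)^2."""
    d = q / (2.0 * Q)
    proj = max(lo, min(hi, d))  # projection of d onto the interval [lo, hi]
    return q * q / (4.0 * Q) - Q * (proj - d) ** 2

def F_brute_force(Q, q, lo, hi, n=100000):
    """Grid-search approximation of the same supremum."""
    return max(-Q * v * v + q * v
               for v in (lo + (hi - lo) * k / n for k in range(n + 1)))

# Constrained case: d = 1 lies outside C = [-0.5, 0.5]
print(F_closed_form(1.0, 2.0, -0.5, 0.5))  # → 0.75
```

Both the constrained case (d outside C) and the unconstrained case (d inside C) agree with the grid search, which is the same computation the Hamiltonian maximization performs coordinate by coordinate.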

Economic interpretation.
  • Y0Y_{0} is a fixed compensation determined by the reservation utility of the portfolio manager;

  • the integrand process Z^{i} represents a compensation with respect to the evolution of the i-th asset S^{i}, when the latter is observable by the investor and hence contractible;

  • Z^{X} is a compensation term with respect to the portfolio dynamics, always observable by the investor;

  • U is a compensation with respect to the default risk of the market;

  • \Gamma^{i} is a compensation with respect to the covariation of S^{i} and X, while the term \Gamma^{X}+\eta(Z^{X})^{2} is a compensation driven by the risk aversion of the manager with respect to the quadratic variation of the portfolio;

  • F is the certainty-equivalent utility gained by the portfolio manager when solving her optimization problem. The gain resulting from this optimization is transferred into the contract.

Remark 2.

We can refine the set of contracts depending on the information available to the investor (see [14]).

  • We denote by \Xi^{\circ}\subset\Xi the set of random variables \xi=Y_{T\wedge\tau}^{Y_{0},Z^{X},U,\Gamma^{X}} with Z^{i}=\Gamma^{i}=0, corresponding to the case where S is not observable by the investor and hence not contractible.

  • The set of linear contracts \Xi^{l}\subset\Xi is defined by

    \xi=Y_{0}+\int_{0}^{T\wedge\tau}\sum_{i=1}^{m}\frac{Z^{i}_{s}}{S^{i}_{s}}dS^{i}_{s}+p(X_{T\wedge\tau}-X_{0})+\int_{0}^{T\wedge\tau}U_{s}dH_{s}+\frac{1}{2}\int_{0}^{T\wedge\tau}(\Gamma_{s}^{X}+\eta p^{2})d\langle X,X\rangle_{s}
    -\int_{0}^{T\wedge\tau}F(s,Z_{s},p,\Gamma_{s},\Gamma_{s}^{X})ds+\int_{0}^{T\wedge\tau}\sum_{i=1}^{m}\frac{\Gamma_{s}^{i}}{S^{i}_{s}}d\langle S^{i},X\rangle_{s}

    with contractible S. The idea behind this contract is that, in practice, most fund managers ask their clients for a fixed percentage of the terminal wealth as compensation, which amounts to forcing Z_{s}^{X}=p in \Xi, where p is the fixed percentage the agent receives.

In all these cases, Z=(Z^{i})_{1\leq i\leq m},Z^{X},U,\Gamma,\Gamma^{X} are predictable processes such that \xi\in\Xi and all the stochastic integrals are martingales.

Remark 3.

Note that the set of contracts \Xi is stated without loss of generality, as explained in [15, 11], as soon as we consider general contracts \xi with exponential moments of any order, which requires refining the definition of \mathcal{U} in order to apply the existence results in [29, 25].

2.4 Delegated portfolio management and bi-level stochastic programming

We assume that the portfolio manager receives the contract \xi and optimally chooses a strategy \pi in order to stay close to a benchmark strategy \alpha à la Almgren-Chriss (see [2]), so that the objective of the manager is to solve, for any fixed contract \xi\in\Xi,

V^{A}_{0}(x,\xi)=\sup_{\pi\in\mathcal{A}}J^{A}(\pi;x,\xi), (A)

where

J^{A}(\pi;x,\xi):=\mathbb{E}\Big[U_{A}\Big(\xi-\int_{0}^{T\wedge\tau}\|\pi_{s}-\alpha_{s}\|^{2}ds\Big)\Big].

In our model, the investor chooses a full delegation of the portfolio management to the manager, thus letting the manager choose the optimal strategy to optimize the terminal value of the portfolio under default. In this second-best case, the contracting problem reduces to the following bi-level optimization under constraints when S is contractible:

V_{0}^{P}(x):=\sup_{(\xi,\hat{\pi})\in\Xi\times\mathcal{A}}\mathbb{E}[U_{P}(X_{T\wedge\tau}-\xi)] (P)

subject to

  • (R):  V_{0}^{A}(x,\xi)\geq R_{0}

  • (IC):  V_{0}^{A}(x,\xi)=J^{A}(\hat{\pi};x,\xi).

We will refer to problem (A) as the Agent's problem, and to the bi-level program (P) as the Principal's problem. Furthermore, we emphasize that \tau need not be a stopping time in the natural filtration generated by the assets' dynamics. Since most of the results regarding default times are based on stopping-time theory, we work with a filtration in which \tau is a stopping time.

3 Optimal contract and investment strategy

3.1 Optimal investment with random horizon: solving the agent problem

A common approach in the continuous-time stochastic optimization literature is based on the solution of Backward Stochastic Differential Equations (BSDEs) through a martingale optimality principle. We refer the reader to [23] for a detailed explanation of the method in a continuous setting and to [35, 29, 25] for extensions to discontinuous processes or default times. The general idea is to generate a family of supermartingales (R^{\pi}) indexed by the control variable, in our case the investment strategy \pi, with terminal condition given by the objective function of the agent J^{A} at time T\wedge\tau. If we can find a specific control \hat{\pi} such that R^{\hat{\pi}} is a martingale, this control is optimal for (A), and the optimal value is given by the process indexed by the optimal control at time 0.

Lemma 1 (Martingale Optimality Principle).

Let ξΞ\xi\in\Xi and let (Rtτπ)t[0,T](R^{\pi}_{t\wedge\tau})_{t\in[0,T]} be a family of stochastic processes indexed by the strategy π𝒜\pi\in\mathcal{A} such that

  1. (i)

    RTτπ=UA(ξ0Tτπsαs2𝑑s)R^{\pi}_{T\wedge\tau}=U^{A}(\xi-\int_{0}^{T\wedge\tau}\|\pi_{s}-\alpha_{s}\|^{2}ds),

  2. (ii)

    RπR^{\pi} is a 𝔾\mathbb{G}-supermartingale and R0πR^{\pi}_{0} is constant for all π𝒜\pi\in\mathcal{A},

  3. (iii)

    there exists π^𝒜\hat{\pi}\in\mathcal{A} such that Rπ^R^{\hat{\pi}} is a 𝔾\mathbb{G}-martingale.

Then, π^\hat{\pi} is a solution to the maximization problem (A).

Proof.

Take π𝒜\pi\in\mathcal{A}. Then, we have

JA(π;x,ξ)=(i)𝔼[RTτπ](ii)𝔼[R0π]=(ii)𝔼[R0π^]=(iii)𝔼[RTτπ^]=(i)JA(π^;x,ξ).\displaystyle J^{A}(\pi;x,\xi)\overset{(i)}{=}\mathbb{E}[R^{\pi}_{T\wedge\tau}]\overset{(ii)}{\leq}\mathbb{E}[R^{\pi}_{0}]\overset{(ii)}{=}\mathbb{E}[R^{\hat{\pi}}_{0}]\overset{(iii)}{=}\mathbb{E}[R^{\hat{\pi}}_{T\wedge\tau}]\overset{(i)}{=}J^{A}(\hat{\pi};x,\xi). ∎

Let ξΞ\xi\in\Xi be fixed. Independently of the boundedness of the default time, that is, either under Hypothesis A or Hypothesis B, we define

Rtτπ:=UA(Ytτπ0tτπsαs2𝑑s),R^{\pi}_{t\wedge\tau}:=U^{A}(Y^{\pi}_{t\wedge\tau}-\int_{0}^{t\wedge\tau}\|\pi_{s}-\alpha_{s}\|^{2}ds),

where YπY^{\pi} is defined by

Ytτπ\displaystyle Y^{\pi}_{t\wedge\tau} =Y0+0tτi=1mZsiSsidSsi+0tτZsX𝑑Xs+0tτUs𝑑Hs\displaystyle=Y_{0}+\int_{0}^{t\wedge\tau}\sum_{i=1}^{m}\frac{Z^{i}_{s}}{S^{i}_{s}}dS^{i}_{s}+\int_{0}^{t\wedge\tau}Z_{s}^{X}dX_{s}+\int_{0}^{t\wedge\tau}U_{s}dH_{s}
+120tτ(ΓsX+η(ZsX)2)dX,Xs0tτF(s,Zs,ZsX,Γs,ΓsX,πs)𝑑s\displaystyle+\frac{1}{2}\int_{0}^{t\wedge\tau}(\Gamma_{s}^{X}+\eta(Z_{s}^{X})^{2})d\langle X,X\rangle_{s}-\int_{0}^{t\wedge\tau}F(s,Z_{s},Z_{s}^{X},\Gamma_{s},\Gamma_{s}^{X},\pi_{s})ds
+0tτi=1mΓsiSsidSi,Xs.\displaystyle+\int_{0}^{t\wedge\tau}\sum_{i=1}^{m}\frac{\Gamma^{i}_{s}}{S^{i}_{s}}d\langle S^{i},X\rangle_{s}.
Theorem 1.

Assume that Hypothesis and either Hypothesis A or Hypothesis B are satisfied. For any ξΞ\xi\in\Xi, the optimal strategy solving (A) is

π^t=π(Zt,ZtX,Γt,ΓtX), with π(z,zx,g,gx):=proj(et,C)\hat{\pi}_{t}=\pi^{*}(Z_{t},Z_{t}^{X},\Gamma_{t},\Gamma^{X}_{t}),\text{ with }\pi^{*}(z,z_{x},g,g_{x}):=proj(e_{t},C) (2)

and the optimal value is given by V0A(x,ξ)=eηY0V^{A}_{0}(x,\xi)=-e^{-\eta Y_{0}}, where

et=12qtQt1,qt=(ZtXσtθt+σtσtTΓt+αtηZtXσtσtTZtT)e_{t}=\frac{1}{2}q_{t}Q_{t}^{-1},\;q_{t}=(Z^{X}_{t}\sigma_{t}\theta_{t}+\sigma_{t}\sigma^{T}_{t}\Gamma_{t}+\alpha_{t}-\eta Z_{t}^{X}\sigma_{t}\sigma^{T}_{t}Z^{T}_{t})

and

Qt=12(ΓtXσtσtT).Q_{t}=\frac{1}{2}(\mathcal{I}-\Gamma^{X}_{t}\sigma_{t}\sigma^{T}_{t}).
Proof.

The proof follows the same ideas as [23] for the continuous case and [25, 35] for the discontinuous case, extending them to the multi-dimensional setting with the contract ξ\xi fixed by the principal. Note that Z=(Zi)i=1mZ=(Z^{i})_{i=1}^{m} and Γ=(Γi)i=1m\Gamma=(\Gamma^{i})_{i=1}^{m} are m\mathbb{R}^{m}-valued stochastic (row) vectors, and we denote by CC the set of admissible trading strategies π=(πt)t[0,T]\pi=(\pi_{t})_{t\in[0,T]}. The proof relies on Ito’s formula with Poisson jumps (since the intensity of our jump is the same as that of a simple Poisson process with varying intensity λ\lambda), for which we refer to [41, 24] for more details on stochastic calculus with semi-martingales, and on the notion of the Doléans-Dade exponential process (DDE). Given a semi-martingale PP, we denote its DDE by

(P)t=exp{Pt12Pct}0st{(1+ΔsP)exp(ΔsP)},\mathcal{E}(P)_{t}=\exp\{P_{t}-\frac{1}{2}\langle P^{c}\rangle_{t}\}\prod_{0\leq s\leq t}\{(1+\Delta_{s}P)\exp(-\Delta_{s}P)\},

where PcP^{c} is the continuous component of the path of PP and ΔsP:=PsPs\Delta_{s}P:=P_{s}-P_{s-}. Note that the DDE of PP solves the SDE dYt=YtdPtdY_{t}=Y_{t-}dP_{t} and that, provided PP is a BMOBMO-martingale, the resulting DDE is a (local) martingale as well.
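As a quick numerical illustration of this definition (a toy sketch with assumed choices, not part of the model), one can take PP to be a standard Brownian motion WW, for which the DDE reduces to exp(Wtt/2)\exp(W_{t}-t/2), and compare it with an Euler scheme for the SDE dYt=YtdPtdY_{t}=Y_{t-}dP_{t}:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy check (assumed setup) of the Doleans-Dade exponential for a continuous
# semimartingale: take P = W, a standard Brownian motion, so that
# E(P)_t = exp(W_t - t/2), and compare with an Euler scheme for the SDE
# dY_t = Y_{t-} dP_t, Y_0 = 1.
T, n = 1.0, 100_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), n)

Y = 1.0
for inc in dW:                  # Euler discretisation of dY = Y dW
    Y *= 1.0 + inc

closed_form = np.exp(dW.sum() - 0.5 * T)
print(Y, closed_form)           # agree up to discretisation error
```

The two outputs agree up to discretisation error, illustrating why the DDE is the natural exponential for semi-martingales.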

To find the optimal solution, we apply the so-called martingale optimality principle. We define the following family of stochastic processes, indexed by the strategy π\pi:

Rtπ=UA(Ytπ0tπsαs2𝑑s).R^{\pi}_{t}=U^{A}(Y^{\pi}_{t}-\int_{0}^{t}||\pi_{s}-\alpha_{s}||^{2}ds).

We set Btπ:=Ytπ0tπsαs2𝑑sB^{\pi}_{t}:=Y^{\pi}_{t}-\int_{0}^{t}||\pi_{s}-\alpha_{s}||^{2}ds, so that Rtπ=exp(ηBtπ)R^{\pi}_{t}=-\exp{(-\eta B^{\pi}_{t})}. Note that the Ito decomposition of BπB^{\pi} is given by

dBtπ\displaystyle dB^{\pi}_{t} =i=1mZtiStidSti+ZtXdXt+UtdHt+12(ΓtX+η(ZtX)2)dX,Xt=\sum_{i=1}^{m}\frac{Z^{i}_{t}}{S^{i}_{t}}dS^{i}_{t}+Z_{t}^{X}dX_{t}+U_{t}dH_{t}+\frac{1}{2}(\Gamma_{t}^{X}+\eta(Z_{t}^{X})^{2})d\langle X,X\rangle_{t}
(F(t,Zt,ZtX,Γt,ΓtX)+πtαt2)dt+i=1mΓtiStidSi,Xt,t[0,Tτ].\displaystyle-(F(t,Z_{t},Z_{t}^{X},\Gamma_{t},\Gamma_{t}^{X})+||\pi_{t}-\alpha_{t}||^{2})dt+\sum_{i=1}^{m}\frac{\Gamma^{i}_{t}}{S^{i}_{t}}d\langle S^{i},X\rangle_{t},\;t\in[0,T\wedge\tau].

Hence,

dBtπ\displaystyle dB^{\pi}_{t} =Jtπdt+(Ztσt+ZtXπtσt)dWt+UtdHt,\displaystyle=J^{\pi}_{t}dt+(Z_{t}\sigma_{t}+Z^{X}_{t}\pi_{t}\sigma_{t})dW_{t}+U_{t}dH_{t},

where

Jtπ=Ztbt+ZtXπtσtθt+12(ΓtX+η(ZtX)2)πtσt2F(t,Zt,ZtX,Γt,ΓtX)+πtσtσtTΓt12πtαt2.J^{\pi}_{t}=Z_{t}b_{t}+Z^{X}_{t}\pi_{t}\sigma_{t}\theta_{t}+\frac{1}{2}(\Gamma^{X}_{t}+\eta(Z^{X}_{t})^{2})||\pi_{t}\sigma_{t}||^{2}-F(t,Z_{t},Z_{t}^{X},\Gamma_{t},\Gamma_{t}^{X})+\pi_{t}\sigma_{t}\sigma^{T}_{t}\Gamma_{t}-\frac{1}{2}||\pi_{t}-\alpha_{t}||^{2}.

We thus deduce that

dRtπ\displaystyle dR^{\pi}_{t} =ηRtπJtπdtηRtπ(Ztσt+ZtXπtσt)dWt+Rtπ(eηUt1)dMt=-\eta R^{\pi}_{t}J^{\pi}_{t}dt-\eta R^{\pi}_{t}(Z_{t}\sigma_{t}+Z^{X}_{t}\pi_{t}\sigma_{t})dW_{t}+R^{\pi}_{t_{-}}(e^{-\eta U_{t}}-1)dM_{t}

where we used the fact that dHt=dMt+λtdtdH_{t}=dM_{t}+\lambda_{t}dt. Note that R0=exp(ηY0)<0R_{0}=-\exp{(-\eta Y_{0})}<0, that Jπ0J^{\pi}\leq 0, and that JπJ^{\pi} vanishes when π\pi is chosen by maximizing

πtQtπtT+πtqtT+dt-\pi_{t}Q_{t}\pi_{t}^{T}+\pi_{t}q_{t}^{T}+d_{t}

with

Qt=12(ΓtXσtσtT),Q_{t}=\frac{1}{2}(\mathcal{I}-\Gamma^{X}_{t}\sigma_{t}\sigma^{T}_{t}),
qt=(ZtXσtθt+σtσtTΓt+αtηZtXσtσtTZtT),q_{t}=(Z^{X}_{t}\sigma_{t}\theta_{t}+\sigma_{t}\sigma^{T}_{t}\Gamma_{t}+\alpha_{t}-\eta Z_{t}^{X}\sigma_{t}\sigma^{T}_{t}Z^{T}_{t}),

and

dt=ZtbtFη2Ztσt212αt2λtη(eηUt1).d_{t}=Z_{t}b_{t}-F-\frac{\eta}{2}||Z_{t}\sigma_{t}||^{2}-\frac{1}{2}||\alpha_{t}||^{2}-\frac{\lambda_{t}}{\eta}(e^{-\eta U_{t}}-1).

Therefore,

π(Zt,ZtX,Γt,ΓtX)=proj(et,C)\pi^{*}(Z_{t},Z_{t}^{X},\Gamma_{t},\Gamma^{X}_{t})=proj(e_{t},C)

so that π\pi^{*} is optimal, and unique since our set CC is compact and convex. ∎
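To make the quadratic step above concrete, the following sketch (with assumed toy values for QtQ_{t} and qtq_{t}, and CC taken to be a Euclidean ball, one possible choice of compact convex set) computes the unconstrained maximizer e=12qQ1e=\frac{1}{2}qQ^{-1} and the projection proj(e,C)proj(e,C) appearing in Theorem 1; in this toy instance ee already lies inside CC, so the projection is trivial and the projected point is the exact maximizer:

```python
import numpy as np

# Sketch with assumed toy data: maximise -pi Q pi^T + pi q^T over a compact
# convex set C, here the Euclidean ball of radius r (an assumed choice).
# Unconstrained maximiser: e = (1/2) q Q^{-1} for symmetric positive definite Q;
# the optimiser used in Theorem 1 is proj(e, C).
def optimal_pi(Q, q, r):
    e = 0.5 * q @ np.linalg.inv(Q)              # unconstrained maximiser
    norm = np.linalg.norm(e)
    return e if norm <= r else (r / norm) * e   # projection onto the ball

Q = np.array([[1.0, 0.2], [0.2, 0.8]])          # toy positive-definite Q_t
q = np.array([0.6, -0.3])                       # toy q_t
pi_star = optimal_pi(Q, q, r=1.0)               # here e is interior to C

obj = lambda p: -p @ Q @ p + p @ q              # quadratic objective
rng = np.random.default_rng(1)
for _ in range(1000):                            # no feasible point does better
    p = rng.normal(size=2)
    p = p / np.linalg.norm(p) * rng.uniform(0.0, 1.0)
    assert obj(p) <= obj(pi_star) + 1e-9
```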

Remark 4.

Note that if m=d=1m=d=1, C=C=\mathbb{R}, and σ\sigma and bb are constant, we get

F(t,z,zx,g,gx,u)\displaystyle F(t,z,z_{x},g,g_{x},u) =12(zxσθ+αtηzzxσ2+σ2g1gxσ2)2+zb=\frac{1}{2}\Bigl{(}\frac{z_{x}\sigma\theta+\alpha_{t}-\eta zz_{x}\sigma^{2}+\sigma^{2}g}{\sqrt{1-g_{x}\sigma^{2}}}\Bigr{)}^{2}+zb
12αt2λtη(eηu1)η2z2σ2.\displaystyle-\frac{1}{2}\alpha_{t}^{2}-\frac{\lambda_{t}}{\eta}(e^{-\eta u}-1)-\frac{\eta}{2}z^{2}\sigma^{2}.
Remark 5.

As a consequence of the constraint (IC), we are implicitly looking for contracts such that π^𝒰\hat{\pi}\in\mathcal{U}. This requires considering (Z,ZX,Γ,ΓX,U)(Z,Z^{X},\Gamma,\Gamma^{X},U) such that π^=π(Z,ZX,Γ,ΓX)𝒰\hat{\pi}=\pi^{*}(Z,Z^{X},\Gamma,\Gamma^{X})\in\mathcal{U}; otherwise (IC) is not satisfied and there is no optimal contract.

3.2 The optimal contract and verification results

In the previous section, we found the agent’s optimal strategy π\pi^{*} for any fixed contract in the second-best case with contractible SS. Given that strategy, we now want to solve the principal’s problem (P), i.e., to find the compensation ξ\xi maximizing the principal’s utility:

V0=supξΞ s.t V0A(x)R𝔼[UP(XTτπξ)]V_{0}=\sup_{\xi\in\Xi\text{ s.t }V_{0}^{A}(x)\geq R}\mathbb{E}[U^{P}(X_{T\wedge\tau}^{\pi^{*}}-\xi)] (3)

where we write XTτπX_{T\wedge\tau}^{\pi^{*}} to stress that the object of our maximization problem now depends on the agent’s best strategy π\pi^{*}. Before presenting the intuition behind the solution, it is important to show how this becomes a control problem from which an HJB equation can be derived. To do so, we first characterize the contracts ξ\xi such that V0A(x)RV_{0}^{A}(x)\geq R. We recall that V0A(x)=exp(η(Y0))V_{0}^{A}(x)=-\exp(-\eta(Y_{0})), so that

V0A(x)RY0log(R)η:=Y^0.V_{0}^{A}(x)\geq R\Longleftrightarrow Y_{0}\geq-\frac{\log(-R)}{\eta}:=\hat{Y}_{0}.
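This equivalence is elementary but easy to sanity-check numerically; the following sketch uses assumed toy values of η and RR (with R<0R<0, as required for CARA utility):

```python
import math

# Sanity check (assumed toy parameters) of the participation constraint under
# CARA utility U_A(x) = -exp(-eta x): V0A = -exp(-eta Y0) >= R if and only if
# Y0 >= -log(-R)/eta =: Y0_hat, for a reservation utility R < 0.
eta, R = 2.0, -0.5
Y0_hat = -math.log(-R) / eta            # threshold: log(2)/2 here

V0A = lambda Y0: -math.exp(-eta * Y0)
assert abs(V0A(Y0_hat) - R) < 1e-12     # constraint binds at the threshold
assert V0A(Y0_hat + 0.1) > R            # satisfied above the threshold
assert V0A(Y0_hat - 0.1) < R            # violated below the threshold
```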

Moreover,

ξ=YTτπ\displaystyle\xi=Y^{\pi}_{T\wedge\tau} =Y0+0Tτi=1mZtiStidSti+0TτZtX𝑑Xt+0TτUs𝑑Hs=Y_{0}+\int_{0}^{T\wedge\tau}\sum_{i=1}^{m}\frac{Z^{i}_{t}}{S^{i}_{t}}dS^{i}_{t}+\int_{0}^{T\wedge\tau}Z_{t}^{X}dX_{t}+\int_{0}^{T\wedge\tau}U_{s}dH_{s}
+120Tτ(ΓtX+η(ZtX)2)dX,Xt0TτF(t,Zt,ZtX,Γt,ΓtX,πt)𝑑t\displaystyle+\frac{1}{2}\int_{0}^{T\wedge\tau}(\Gamma_{t}^{X}+\eta(Z_{t}^{X})^{2})d\langle X,X\rangle_{t}-\int_{0}^{T\wedge\tau}F(t,Z_{t},Z_{t}^{X},\Gamma_{t},\Gamma_{t}^{X},\pi^{*}_{t})dt
+0Tτi=1mΓtiStidSi,Xt.\displaystyle+\int_{0}^{T\wedge\tau}\sum_{i=1}^{m}\frac{\Gamma^{i}_{t}}{S^{i}_{t}}d\langle S^{i},X\rangle_{t}.

Therefore, we can rewrite (3) as

V0\displaystyle V_{0} =sup(Z,ZX,Γ,ΓX,U)𝒰𝔼[UP(XTτπY^Tτ)]Y0^,\displaystyle=\sup_{(Z,Z^{X},\Gamma,\Gamma^{X},U)\in\mathcal{U}}\mathbb{E}[U^{P}(X_{T\wedge\tau}^{\pi^{*}}-\hat{Y}_{T\wedge\tau})]-\hat{Y_{0}}, (4)

where Y^Tτ=YTτ0,Z,ZX,U,Γ,ΓX\hat{Y}_{T\wedge\tau}=Y_{T\wedge\tau}^{0,Z,Z^{X},U,\Gamma,\Gamma^{X}}. In order to derive the HJB equation with control process (Z,ZX,Γ,ΓX,U)(Z,Z^{X},\Gamma,\Gamma^{X},U), we will use the results from [9], assuming that τ\tau has a probability density function ff and a cumulative distribution function FF, and treat in the next two subsections the cases where the support of τ\tau is bounded in [0,T][0,T] or unbounded. We recall that the problem of the principal reduces to solving

V^0:=sup(Z,ZX,Γ,ΓX,U)𝒰𝔼[UP(XTτπY^Tτ)].\hat{V}_{0}:=\sup_{(Z,Z^{X},\Gamma,\Gamma^{X},U)\in\mathcal{U}}\mathbb{E}[U^{P}(X_{T\wedge\tau}^{\pi^{*}}-\hat{Y}_{T\wedge\tau})].

To derive the HJB equation from this problem, we impose an additional assumption enforcing Markovian properties of the drift and volatility processes.

Assumption (M). We assume that there exist two 𝔽\mathbb{F}-progressively measurable functions b,σb,\sigma such that bt=b(t,St)b_{t}=b(t,S_{t}) and σt=σ(t,St)\sigma_{t}=\sigma(t,S_{t}).

We define the following system of coupled SDEs with jumps, with solution (Xπ,Y^)(X^{\pi^{*}},\hat{Y}) controlled by (Z,ZX,Γ,ΓX,U)(Z,Z^{X},\Gamma,\Gamma^{X},U):

(SDE){dXtπ=πtσ(t,St)dWt+πtb(t,St)dtdY^t=i=1mZtiStidSti+ZtXdXtπ+UtdHt+12(ΓtX+η(ZtX)2)dXπ,XπtF(t,Zt,ZtX,Γt,ΓtX,πt)dt+i=1mΓtiStidSi,XπtX0π=x,Y^0=0.(SDE)\begin{cases}&dX^{\pi^{*}}_{t}=\pi_{t}^{*}\sigma(t,S_{t})dW_{t}+\pi_{t}^{*}b(t,S_{t})dt\\ &d\hat{Y}_{t}=\sum_{i=1}^{m}\frac{Z^{i}_{t}}{S^{i}_{t}}dS^{i}_{t}+Z_{t}^{X}dX^{\pi^{*}}_{t}+U_{t}dH_{t}+\frac{1}{2}(\Gamma_{t}^{X}+\eta(Z_{t}^{X})^{2})d\langle X^{\pi^{*}},X^{\pi^{*}}\rangle_{t}\\ &\qquad-F(t,Z_{t},Z_{t}^{X},\Gamma_{t},\Gamma_{t}^{X},\pi^{*}_{t})dt+\sum_{i=1}^{m}\frac{\Gamma^{i}_{t}}{S^{i}_{t}}d\langle S^{i},X^{\pi^{*}}\rangle_{t}\\ &X^{\pi^{*}}_{0}=x,\\ &\hat{Y}_{0}=0.\end{cases}

For the sake of computational simplicity and to avoid overwhelming notation, we will assume from now on that the principal is risk neutral, so that UP(x)=xU^{P}(x)=x.
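Before specializing to the two cases, here is a minimal simulation sketch of the controlled wealth stopped at the default time. It is a toy one-dimensional version with assumed constant coefficients and a fixed constant strategy π; the contract component Ŷ and the optimal controls are omitted:

```python
import numpy as np

rng = np.random.default_rng(42)

# Minimal sketch (assumed toy parameters, not the paper's calibration):
# Euler scheme for the wealth dX_t = pi * b dt + pi * sigma dW_t stopped at
# T ^ tau, where tau is sampled from its distribution -- here Uniform[0, T],
# i.e. a bounded default time as in Hypothesis B.
T, n_steps, n_paths = 1.0, 500, 10_000
dt = T / n_steps
pi, b, sigma = 0.5, 0.1, 0.2           # fixed strategy, constant coefficients

tau = rng.uniform(0.0, T, n_paths)     # default times, one per path
X = np.zeros(n_paths)                  # X_0 = 0 for every path
for k in range(n_steps):
    alive = tau > k * dt               # paths whose default has not occurred
    dW = rng.normal(0.0, np.sqrt(dt), n_paths)
    X[alive] += pi * b * dt + pi * sigma * dW[alive]

# X now approximates X_{T ^ tau} path by path; its mean is close to
# pi * b * E[T ^ tau] = 0.5 * 0.1 * 0.5 = 0.025 for this default time.
print(X.mean())
```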

3.2.1 Bounded default time

Under Hypothesis B, the support of τ\tau is included in [0,T][0,T]. Recalling the results in [9], we deduce that

V^0=sup(Z,ZX,Γ,ΓX,U)𝒰𝔼[0T(XtπY^t)f(t)dt],\hat{V}_{0}=\sup_{(Z,Z^{X},\Gamma,\Gamma^{X},U)\in\mathcal{U}}\;\mathbb{E}\bigl{[}\int_{0}^{T}(X_{t}^{\pi^{*}}-\hat{Y}_{t})f(t)dt\bigl{]},

where ff is the density of τ\tau. We introduce the following Hamilton-Jacobi-Bellman integro-partial differential equation.

(bHJB){tϕ(t,s,x,y)+(xy)f(t)+sup(zx,z,gx,g,u){ϕ(t,s,x,y,zx,z,gx,g,u)}=0,t<Tϕ(T,s,x,y)=0,(s,x,y)m××.\textbf{(bHJB)}\quad\begin{cases}&\partial_{t}\phi(t,s,x,y)+(x-y)f(t)+\sup_{(z^{x},z,g^{x},g,u)}\bigl{\{}\mathcal{H}^{\phi}(t,s,x,y,z^{x},z,g^{x},g,u)\bigl{\}}=0,\;t<T\\ &\phi(T,s,x,y)=0,\;(s,x,y)\in\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R}.\end{cases}

where ϕ\mathcal{H}^{\phi} is a differential operator given by

ϕ(t,s,x,y,zx,z,gx,g,u):=\displaystyle\mathcal{H}^{\phi}(t,s,x,y,z^{x},z,g^{x},g,u):=
i=1mϕsisibi(t,s)+ϕxπ(z,zx,g,gx)σ(t,s)θ(t,s)\displaystyle\sum_{i=1}^{m}\phi_{s^{i}}s^{i}b^{i}(t,s)+\phi_{x}\pi^{*}(z,z^{x},g,g^{x})\sigma(t,s)\theta(t,s)
+12Tr(sσ(t,s)σT(t,s)sD2ϕ)+12ϕxxπ(z,zx,g,gx)σ(t,s)2\displaystyle+\frac{1}{2}Tr(s\sigma(t,s)\sigma^{T}(t,s)sD^{2}\phi)+\frac{1}{2}\phi_{xx}||\pi^{*}(z,z^{x},g,g^{x})\sigma(t,s)||^{2}
+ϕy[zb(t,s)+zxπ(z,zx,g,gx)σ(t,s)θ(t,s)+12||σ(t,s)π(z,zx,g,gx)||2(gx+η|zx|2)\displaystyle+\phi_{y}\Big{[}zb(t,s)+z^{x}\pi^{*}(z,z^{x},g,g^{x})\sigma(t,s)\theta(t,s)+\frac{1}{2}||\sigma(t,s)\pi^{*}(z,z^{x},g,g^{x})||^{2}(g^{x}+\eta|z^{x}|^{2})
F(t,z,zx,g,gx,π(z,zx,g,gx))+π(z,zx,g,gx)σ(t,s)σ(t,s)Tg]\displaystyle-F(t,z,z^{x},g,g^{x},\pi^{*}(z,z^{x},g,g^{x}))+\pi^{*}(z,z^{x},g,g^{x})\sigma(t,s)\sigma(t,s)^{T}g\Big{]}
+12ϕyyzσ(t,s)+zxπ(z,zx,g,gx)σ(t,s)2\displaystyle+\frac{1}{2}\phi_{yy}||z\sigma(t,s)+z^{x}\pi^{*}(z,z^{x},g,g^{x})\sigma(t,s)||^{2}
+ϕxy(zxπ(z,zx,g,gx)σ(t,s)2+π(z,zx,g,gx)σ(t,s)(zσ)T)\displaystyle+\phi_{xy}(z^{x}||\pi^{*}(z,z^{x},g,g^{x})\sigma(t,s)||^{2}+\pi^{*}(z,z^{x},g,g^{x})\sigma(t,s)(z\sigma)^{T})
+i=1m(π(z,zx,g,gx)σ(t,s)σi(t,s)Tsi)(ϕxsi+ϕysizx)+λt(ϕ(t,s,x,y+u)ϕ(t,s,x,y))\displaystyle+\sum_{i=1}^{m}(\pi^{*}(z,z^{x},g,g^{x})\sigma(t,s)\sigma^{i}(t,s)^{T}s^{i})(\phi_{xs^{i}}+\phi_{ys^{i}}z^{x})+\lambda_{t}(\phi(t,s,x,y+u)-\phi(t,s,x,y))
+i=1mϕysiziσ(t,s)σi(t,s)Tsi.\displaystyle+\sum_{i=1}^{m}\phi_{ys^{i}}z^{i}\sigma(t,s)\sigma^{i}(t,s)^{T}s^{i}.

We define

Z^t:=z^(t,St,Xtπ,Y^t),Z^tX=z^x(t,St,Xtπ,Y^t),U^t=u^(t,St,Xtπ,Y^t),\hat{Z}_{t}:=\hat{z}(t,S_{t},X^{\pi^{*}}_{t},\hat{Y}_{t}),\;\hat{Z}_{t}^{X}=\hat{z}_{x}(t,S_{t},X^{\pi^{*}}_{t},\hat{Y}_{t}),\;\hat{U}_{t}=\hat{u}(t,S_{t},X^{\pi^{*}}_{t},\hat{Y}_{t}),
Γ^t=g^(t,St,Xtπ,Y^t),Γ^tX=g^x(t,St,Xtπ,Y^t),\hat{\Gamma}_{t}=\hat{g}(t,S_{t},X^{\pi^{*}}_{t},\;\hat{Y}_{t}),\hat{\Gamma}_{t}^{X}=\hat{g}_{x}(t,S_{t},X^{\pi^{*}}_{t},\hat{Y}_{t}),

where z^,z^x,g^,g^x,u^\hat{z},\hat{z}_{x},\hat{g},\hat{g}_{x},\hat{u} optimize ϕ\mathcal{H}^{\phi}.

Theorem 2 (Verification Theorem - bounded case).

Assume that there exists a function ϕ\phi twice continuously differentiable in space and differentiable in time, such that ϕ(t,s,x,y)\phi(t,s,x,y) solves (bHJB). Furthermore, assume that ϕ\phi has a quadratic growth in yy and polynomial growth in s,xs,x such that

|ϕ(t,s,x,y)|κ(1+|x|p+sp+|y|2),p>1,κ>0.|\phi(t,s,x,y)|\leq\kappa(1+|x|^{p}+\|s\|^{p}+|y|^{2}),\;p>1,\;\kappa>0.

Then, for each t[0,T]t\in[0,T], the strategy (Z^X,Z^,Γ^X,Γ^,U^)(\hat{Z}^{X},\hat{Z},\hat{\Gamma}^{X},\hat{\Gamma},\hat{U}) is an optimal strategy for the control problem and

ϕ(0,S0,x,0)=V0=sup(ZX,Z,ΓX,Γ,U)𝔼[XTτπY^Tτ]Y0^.\displaystyle\phi(0,S_{0},x,0)=V_{0}=\sup_{(Z^{X},Z,\Gamma^{X},\Gamma,U)}\mathbb{E}\left[X_{T\wedge\tau}^{\pi^{*}}-\hat{Y}_{T\wedge\tau}\right]-\hat{Y_{0}}.

The optimal contract is given by

ξ\displaystyle\xi^{\star} =Y^0+0Tτi=1mZ^tiStidSti+0TτZ^tX𝑑Xt+0TτU^s𝑑Hs\displaystyle=\hat{Y}_{0}+\int_{0}^{T\wedge\tau}\sum_{i=1}^{m}\frac{\hat{Z}^{i}_{t}}{S^{i}_{t}}dS^{i}_{t}+\int_{0}^{T\wedge\tau}\hat{Z}_{t}^{X}dX_{t}+\int_{0}^{T\wedge\tau}\hat{U}_{s}dH_{s}
+120Tτ(Γ^tX+η(Z^tX)2)dX,Xt0TτF(t,Z^t,Z^tX,Γ^t,Γ^tX,πt)𝑑t\displaystyle+\frac{1}{2}\int_{0}^{T\wedge\tau}(\hat{\Gamma}_{t}^{X}+\eta(\hat{Z}_{t}^{X})^{2})d\langle X,X\rangle_{t}-\int_{0}^{T\wedge\tau}F(t,\hat{Z}_{t},\hat{Z}_{t}^{X},\hat{\Gamma}_{t},\hat{\Gamma}_{t}^{X},\pi^{*}_{t})dt
+0Tτi=1mΓ^tiStidSi,Xt.\displaystyle+\int_{0}^{T\wedge\tau}\sum_{i=1}^{m}\frac{\hat{\Gamma}^{i}_{t}}{S^{i}_{t}}d\langle S^{i},X\rangle_{t}.
Remark 6.

If we assume that σ(t,s)=σm×d\sigma(t,s)=\sigma\in\mathbb{R}^{m\times d} and b(t,s)=bmb(t,s)=b\in\mathbb{R}^{m}, we note that the solution vv to the HJB equation (bHJB) does not depend on ss. It can thus be rewritten as

(bHJB){tv(t,x,y)+(xy)f(t)+sup(zx,z,gx,g,u){v(t,x,y,v,Δv,zx,z,gx,g,u)}=0,t<Tv(T,x,y)=0,(x,y)×,\textbf{(bHJB)}\quad\begin{cases}\partial_{t}v(t,x,y)+(x-y)f(t)+\sup_{(z^{x},z,g^{x},g,u)}\bigl{\{}\mathcal{H}^{v}(t,x,y,\nabla v,\Delta v,z^{x},z,g^{x},g,u)\bigl{\}}=0,\;t<T\\ v(T,x,y)=0,\;(x,y)\in\mathbb{R}\times\mathbb{R},\end{cases}

where

v(t,x,y,zx,z,gx,g,u):=vxπ(z,zx,g,gx)σθ+12vxxπ(z,zx,g,gx)σ2\displaystyle\mathcal{H}^{v}(t,x,y,z^{x},z,g^{x},g,u):=v_{x}\pi^{*}(z,z^{x},g,g^{x})\sigma\theta+\frac{1}{2}v_{xx}||\pi^{*}(z,z^{x},g,g^{x})\sigma||^{2}
+vy[zTb+zxπ(z,zx,g,gx)σθ+12||π(z,zx,g,gx)σ||2(gx+η|zx|2)\displaystyle+v_{y}\Big{[}z^{T}b+z^{x}\pi^{*}(z,z^{x},g,g^{x})\sigma\theta+\frac{1}{2}||\pi^{*}(z,z^{x},g,g^{x})\sigma||^{2}(g^{x}+\eta|z^{x}|^{2})
F(t,z,zx,g,gx,π(z,zx,g,gx))+π(z,zx,g,gx)σσTg]\displaystyle-F(t,z,z^{x},g,g^{x},\pi^{*}(z,z^{x},g,g^{x}))+\pi^{*}(z,z^{x},g,g^{x})\sigma\sigma^{T}g\Big{]}
+12vyyzTσ+zxπ(z,zx,g,gx)σ2\displaystyle+\frac{1}{2}v_{yy}||z^{T}\sigma+z^{x}\pi^{*}(z,z^{x},g,g^{x})\sigma||^{2}
+vxy(zxπ(z,zx,g,gx)σ2+π(z,zx,g,gx)σσTz)\displaystyle+v_{xy}(z^{x}||\pi^{*}(z,z^{x},g,g^{x})\sigma||^{2}+\pi^{*}(z,z^{x},g,g^{x})\sigma\sigma^{T}z)
+λt(v(t,x,y+u)v(t,x,y)).\displaystyle+\lambda_{t}(v(t,x,y+u)-v(t,x,y)).
Proof.

The proof of the theorem relies on a localization procedure. We first assume that there exists a solution ϕ\phi to (bHJB) satisfying

|ϕ(t,s,x,y)|κ(1+|x|p+sp+|y|2),p>1,κ>0.|\phi(t,s,x,y)|\leq\kappa(1+|x|^{p}+\|s\|^{p}+|y|^{2}),\;p>1,\;\kappa>0.

Let (Z,ZX,Γ,ΓX,U)𝒰(Z,Z^{X},\Gamma,\Gamma^{X},U)\in\mathcal{U}. We introduce the following stopping time

τn:=inf{t>0,(St,Xtπ,Y^t,ZX,λt)n(0)}T\tau_{n}:=\inf\{t>0,(S_{t},X^{\pi^{*}}_{t},\hat{Y}_{t},Z^{X},\lambda_{t})\notin\mathcal{B}_{n}(0)\}\wedge T

where n(0)\mathcal{B}_{n}(0) is the centered ball of radius nn in m+4\mathbb{R}^{m+4}. Applying Ito’s formula, we get

ϕ(τn,Sτn,Xτnπ,Y^τn)\displaystyle\phi(\tau_{n},S_{\tau_{n}},X^{\pi^{*}}_{\tau_{n}},\hat{Y}_{\tau_{n}}) =ϕ(0,S0,x,0)+0τn(tϕ(r,Sr,Xrπ,Y^r)+ϕ(r,Sr,Xrπ,Y^r,ZrX,Zr,ΓrX,Γr,Ur))𝑑r=\phi(0,S_{0},x,0)+\int_{0}^{\tau_{n}}\big{(}\partial_{t}\phi(r,S_{r},X^{\pi^{*}}_{r},\hat{Y}_{r})+\mathcal{H}^{\phi}(r,S_{r},X^{\pi^{*}}_{r},\hat{Y}_{r},Z^{X}_{r},Z_{r},\Gamma^{X}_{r},\Gamma_{r},U_{r})\big{)}dr
+0τni=1msiϕ(r,Sr,Xrπ,Y^r)SriσridWr\displaystyle+\int_{0}^{\tau_{n}}\sum_{i=1}^{m}\partial_{s^{i}}\phi(r,S_{r},X^{\pi^{*}}_{r},\hat{Y}_{r})S_{r}^{i}\sigma^{i}_{r}dW_{r}
+0τnxϕ(r,Sr,Xrπ,Y^r)πrσrdWr\displaystyle+\int_{0}^{\tau_{n}}\partial_{x}\phi(r,S_{r},X^{\pi^{*}}_{r},\hat{Y}_{r})\pi_{r}^{*}\sigma_{r}dW_{r}
+0τnyϕ(r,Sr,Xrπ,Y^r)(i=1mZriσri+ZrXπrσr)dWr\displaystyle+\int_{0}^{\tau_{n}}\partial_{y}\phi(r,S_{r},X^{\pi^{*}}_{r},\hat{Y}_{r})(\sum_{i=1}^{m}Z^{i}_{r}\sigma^{i}_{r}+Z_{r}^{X}\pi_{r}^{*}\sigma_{r})dW_{r}
+0τnλr(ϕ(r,Sr,Xrπ,Y^r+Ur)ϕ(r,Sr,Xrπ,Y^r))𝑑Mr.\displaystyle+\int_{0}^{\tau_{n}}\lambda_{r}(\phi(r,S_{r},X^{\pi^{*}}_{r},\hat{Y}_{r}+U_{r})-\phi(r,S_{r},X^{\pi^{*}}_{r},\hat{Y}_{r}))dM_{r}.

Taking expectations, the stochastic integrals vanish by the localisation procedure and we get

𝔼[ϕ(τn,Sτn,Xτnπ,Y^τn)]\displaystyle\mathbb{E}[\phi(\tau_{n},S_{\tau_{n}},X^{\pi^{*}}_{\tau_{n}},\hat{Y}_{\tau_{n}})]
=ϕ(0,S0,x,0)𝔼[0τn(XrπY^r)f(r)𝑑r]\displaystyle=\phi(0,S_{0},x,0)-\mathbb{E}[\int_{0}^{\tau_{n}}(X^{\pi^{*}}_{r}-\hat{Y}_{r})f(r)dr]
+𝔼[0τn(ϕ(r,Sr,Xrπ,Y^r,ZrX,Zr,ΓrX,Γr,Ur)+tϕ(r,Sr,Xrπ,Y^r)+(XrπY^r)f(r))𝑑r].\displaystyle+\mathbb{E}[\int_{0}^{\tau_{n}}\big{(}\mathcal{H}^{\phi}(r,S_{r},X^{\pi^{*}}_{r},\hat{Y}_{r},Z^{X}_{r},Z_{r},\Gamma^{X}_{r},\Gamma_{r},U_{r})+\partial_{t}\phi(r,S_{r},X^{\pi^{*}}_{r},\hat{Y}_{r})+(X^{\pi^{*}}_{r}-\hat{Y}_{r})f(r)\big{)}dr].

Since ϕ\phi satisfies (bHJB),

𝔼[ϕ(τn,Sτn,Xτnπ,Y^τn)+0τn(XrπY^r)f(r)𝑑r]\displaystyle\mathbb{E}[\phi(\tau_{n},S_{\tau_{n}},X^{\pi^{*}}_{\tau_{n}},\hat{Y}_{\tau_{n}})+\int_{0}^{\tau_{n}}(X^{\pi^{*}}_{r}-\hat{Y}_{r})f(r)dr] ϕ(0,S0,x,0)(ZX,Z,ΓX,Γ,U)𝒰\displaystyle\leq\phi(0,S_{0},x,0)\quad\forall(Z^{X},Z,\Gamma^{X},\Gamma,U)\in\mathcal{U}

where the equality holds for (Z^X,Z^,Γ^X,Γ^,U^)(\hat{Z}^{X},\hat{Z},\hat{\Gamma}^{X},\hat{\Gamma},\hat{U}). Recall that ϕ\phi is bounded by a polynomial function in the variables (s,x,y)(s,x,y). We note that

|ϕ(τn,Sτn,Xτnπ,Y^τn)|C(1+|Sτn|p+|Xτnπ|p+|Y^τn|2)C(1+supt{|St|p+|Xtπ|p+|Y^t|2}).|\phi(\tau_{n},S_{\tau_{n}},X^{\pi^{*}}_{\tau_{n}},\hat{Y}_{\tau_{n}})|\leq C(1+|S_{\tau_{n}}|^{p}+|X^{\pi^{*}}_{\tau_{n}}|^{p}+|\hat{Y}_{\tau_{n}}|^{2})\leq C(1+\sup_{t}\{|S_{t}|^{p}+|X^{\pi^{*}}_{t}|^{p}+|\hat{Y}_{t}|^{2}\}).

By applying the dominated convergence theorem, we deduce that

𝔼[ϕ(T,ST,XTπ,Y^T)+0T(XrπY^r)f(r)𝑑r]ϕ(0,S0,x,0),\mathbb{E}[\phi(T,S_{T},X^{\pi^{*}}_{T},\hat{Y}_{T})+\int_{0}^{T}(X^{\pi^{*}}_{r}-\hat{Y}_{r})f(r)dr]\leq\phi(0,S_{0},x,0),

with equality when (ZX,Z,ΓX,Γ,U)(Z^{X},Z,\Gamma^{X},\Gamma,U) = (Z^X,Z^,Γ^X,Γ^,U^)(\hat{Z}^{X},\hat{Z},\hat{\Gamma}^{X},\hat{\Gamma},\hat{U}). Since ϕ(T,s,x,y)=0\phi(T,s,x,y)=0, this yields the announced identity and concludes the proof. ∎

Viscosity solution and dynamic programming.

The assumption in Theorem 2 that (bHJB) admits a 𝒞1,2,2,2\mathcal{C}^{1,2,2,2} solution can be relaxed by showing that the value function of the problem solves the PDE (bHJB) in the viscosity sense. This relaxation of the regularity required of solutions to PDEs was developed by Crandall and Lions in [13, 12]. We refer to [20, 46] for more details about stochastic control with viscosity solutions. We start by recalling the dynamic programming principle, and we define the continuation value of the principal for any control ν:=(Z,ZX,Γ,ΓX,U)\nu:=(Z,Z^{X},\Gamma,\Gamma^{X},U) by

v(t,s,x,y,ν)=𝔼[tT(Xrt,x,πY^rt,y)f(r)dr],v(t,s,x,y,\nu)=\mathbb{E}\bigl{[}\int_{t}^{T}(X^{t,x,\pi^{*}}_{r}-\hat{Y}^{t,y}_{r})f(r)dr\bigl{]},

so that V(t,s,x,y)=supνv(t,s,x,y,ν),V(t,s,x,y)=\sup_{\nu}v(t,s,x,y,\nu), where Xt,x,πX^{t,x,\pi^{*}} and Y^t,y\hat{Y}^{t,y} denote the flow processes in (SDE) starting at time tt with respective initial values xx and yy. The dynamic programming principle states that for any stopping time θ[t,T]\theta\in[t,T] with t<Tt<T we have

V(t,s,x,y)=sup(Z,ZX,Γ,ΓX,U)𝒰[t,θ]𝔼[V(θ,Sθt,s,Xθt,x,π,Yθt,y)+tθ(Xrt,x,πY^rt,y)f(r)dr].V(t,s,x,y)=\sup_{(Z,Z^{X},\Gamma,\Gamma^{X},U)\in\mathcal{U}_{[t,\theta]}}\mathbb{E}\bigl{[}V(\theta,S_{\theta}^{t,s},X_{\theta}^{t,x,\pi^{*}},Y_{\theta}^{t,y})+\int_{t}^{\theta}(X^{t,x,\pi^{*}}_{r}-\hat{Y}^{t,y}_{r})f(r)dr\bigl{]}. (5)

The notion of weak solutions of (bHJB) results from this dynamic programming principle and is defined as follows.

Definition 4 (Viscosity solution).

We say that vv is a lower (resp. upper) semi-continuous super-solution (resp. sub-solution) of (bHJB) on [0,T)×m××[0,T)\times\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R} if for every function ϕ𝒞1,2,2,2\phi\in\mathcal{C}^{1,2,2,2} and (t^,s^,x^,y^)[0,T)×m××(\hat{t},\hat{s},\hat{x},\hat{y})\in[0,T)\times\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R} satisfying

0=min(t,s,x,y)[0,T)×m××(vϕ)(t,s,x,y)=(vϕ)(t^,s^,x^,y^),0=\min_{(t,s,x,y)\in[0,T)\times\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R}}(v-\phi)(t,s,x,y)=(v-\phi)(\hat{t},\hat{s},\hat{x},\hat{y}),
resp. 0=max(t,s,x,y)[0,T)×m××(vϕ)(t,s,x,y)=(vϕ)(t^,s^,x^,y^),\text{resp. }0=\max_{(t,s,x,y)\in[0,T)\times\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R}}(v-\phi)(t,s,x,y)=(v-\phi)(\hat{t},\hat{s},\hat{x},\hat{y}),

we have

tϕ(t^,s^,x^,y^)(x^y^)f(t^)sup(zx,z,gx,g,u)ϕ(t^,s^,x^,y^,zx,z,gx,g,u)0(resp. 0)-\partial_{t}\phi(\hat{t},\hat{s},\hat{x},\hat{y})-(\hat{x}-\hat{y})f(\hat{t})-\sup_{(z^{x},z,g^{x},g,u)}\mathcal{H}^{\phi}(\hat{t},\hat{s},\hat{x},\hat{y},z^{x},z,g^{x},g,u)\geq 0\;\text{(resp. }\leq 0\text{)}

If vv is both a super- and a sub-solution, we say that vv is a viscosity solution to (bHJB).

As a consequence of the dynamic programming principle, we have the following theorem.

Theorem 3.

Assume that the value function VV is locally bounded on [0,T)×m××[0,T)\times\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R}. Then VV is a viscosity solution to (bHJB).

Remark 7.

Note that the infinitesimal generator \mathcal{H} contains a degenerate term through the λ\lambda process. This term explodes as tt approaches TT when the support of τ\tau is bounded, which requires proving Theorem 3 with a localisation technique applied to both the state variables of the problem and the λ\lambda process.
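A simple way to see this explosion is the toy illustration below (an assumed choice of default law): for a default time with bounded support, the intensity is λ(t) = f(t)/(1 − F(t)), which for a uniform default time equals 1/(T − t):

```python
import numpy as np

# Toy illustration (assumed choice of default law): when tau has bounded
# support [0, T], its intensity lambda(t) = f(t) / (1 - F(t)) explodes as
# t -> T.  For tau ~ Uniform[0, T] this is lambda(t) = 1 / (T - t).
T = 1.0
t = np.array([0.0, 0.5, 0.9, 0.99, 0.999])
f = np.full_like(t, 1.0 / T)    # density of Uniform[0, T]
F = t / T                       # its CDF
lam = f / (1.0 - F)             # hazard rate, equal to 1 / (T - t)
print(lam)                      # approximately [1, 2, 10, 100, 1000]
```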

Proof of Theorem 3.

We denote by V,VV_{*},V^{*} the lower semi-continuous and upper semi-continuous envelopes of VV respectively.

Step 1. Proof of the super-solution property. Let ϕ\phi be a 𝒞1,2,2,2\mathcal{C}^{1,2,2,2} function. Let (t^,s^,x^,y^)[0,T)×m××(\hat{t},\hat{s},\hat{x},\hat{y})\in[0,T)\times\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R} be such that

0=min(Vϕ)(t,s,x,y)=(Vϕ)(t^,s^,x^,y^)0=\min(V_{*}-\phi)(t,s,x,y)=(V_{*}-\phi)(\hat{t},\hat{s},\hat{x},\hat{y})

and a sequence (tn,sn,xn,yn)[0,T]×m××(t_{n},s_{n},x_{n},y_{n})\in[0,T]\times\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R} such that

(tn,sn,xn,yn)(t^,s^,x^,y^),(t_{n},s_{n},x_{n},y_{n})\longrightarrow(\hat{t},\hat{s},\hat{x},\hat{y}),

with limV(tn,sn,xn,yn)=V(t^,s^,x^,y^)\lim\limits V(t_{n},s_{n},x_{n},y_{n})=V_{*}(\hat{t},\hat{s},\hat{x},\hat{y}). We define εn:=V(tn,sn,xn,yn)ϕ(tn,sn,xn,yn)0\varepsilon_{n}:=V(t_{n},s_{n},x_{n},y_{n})-\phi(t_{n},s_{n},x_{n},y_{n})\geq 0 and note that εn0\varepsilon_{n}\longrightarrow 0 as nn\to\infty. We also define Xn:=Xtn,xn,π,Yn:=Y^tn,ynX^{n}:=X^{t_{n},x_{n},\pi^{*}},Y^{n}:=\hat{Y}^{t_{n},y_{n}} the solutions to (SDE) starting at time tnt_{n} with respective initial values xnx_{n} and yny_{n}, controlled by a constant control ν𝒰t^\nu\in\mathcal{U}_{\hat{t}}, and by SnS^{n} the price process starting from the price vector sns_{n} at time tnt_{n}. In other words, Stnn=sn,Xtnn=xnS^{n}_{t_{n}}=s_{n},X^{n}_{t_{n}}=x_{n} and Y^tnn=yn\hat{Y}^{n}_{t_{n}}=y_{n}. We also define

δn:=εn𝟏εn0+1n𝟏εn=0,\delta_{n}:=\sqrt{\varepsilon_{n}}\mathbf{1}_{\varepsilon_{n}\neq 0}+\frac{1}{n}\mathbf{1}_{\varepsilon_{n}=0},

and the stopping time

θn=inf{t>tn,(t,Stn,Xtn,Y^tn,λt)[0,tn+δn)×n×[0,n]},\theta_{n}=\inf\{t>t_{n},\;(t,S^{n}_{t},X^{n}_{t},\hat{Y}^{n}_{t},\lambda_{t})\notin[0,t_{n}+\delta_{n})\times\mathcal{B}_{n}\times[0,n]\},

where n\mathcal{B}_{n} is the unit ball on m××\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R} centered at the point (sn,xn,yn)(s_{n},x_{n},y_{n}). We note that θnt^\theta_{n}\longrightarrow\hat{t} when nn goes to \infty. According to the dynamic programming principle (5) we have

V(tn,sn,xn,yn)𝔼[V(θn,Sθnn,Xθnn,Yθnn)+tnθn(XrnYrn)f(r)dr]V(t_{n},s_{n},x_{n},y_{n})\geq\mathbb{E}\bigl{[}V(\theta_{n},S_{\theta_{n}}^{n},X_{\theta_{n}}^{n},Y_{\theta_{n}}^{n})+\int_{t_{n}}^{\theta_{n}}(X_{r}^{n}-Y^{n}_{r})f(r)dr\bigl{]}

or equivalently

0\displaystyle 0 𝔼[V(tn,sn,xn,yn)V(θn,Sθnn,Xθnn,Yθnn)tnθn(XrnYrn)f(r)dr]\displaystyle\leq\mathbb{E}\bigl{[}V(t_{n},s_{n},x_{n},y_{n})-V(\theta_{n},S_{\theta_{n}}^{n},X_{\theta_{n}}^{n},Y_{\theta_{n}}^{n})-\int_{t_{n}}^{\theta_{n}}(X_{r}^{n}-Y^{n}_{r})f(r)dr\bigl{]}
εn+𝔼[ϕ(tn,sn,xn,yn)ϕ(θn,Sθnn,Xθnn,Yθnn)tnθn(XrnYrn)f(r)dr]\displaystyle\leq\varepsilon_{n}+\mathbb{E}\bigl{[}\phi(t_{n},s_{n},x_{n},y_{n})-\phi(\theta_{n},S_{\theta_{n}}^{n},X_{\theta_{n}}^{n},Y_{\theta_{n}}^{n})-\int_{t_{n}}^{\theta_{n}}(X_{r}^{n}-Y^{n}_{r})f(r)dr\bigl{]}

Applying Ito’s formula to the function ϕ\phi and localising up to the stopping time θn\theta_{n}, we get

0\displaystyle 0 εn+𝔼[tnθn(tϕ(r,Srn,Xrn,Yrn)ϕ(r,Srn,Xrn,Yrn,ν)(XrnYrn)f(r))dr]\leq\varepsilon_{n}+\mathbb{E}\bigl{[}\int_{t_{n}}^{\theta_{n}}\big{(}-\partial_{t}\phi(r,S_{r}^{n},X_{r}^{n},Y_{r}^{n})-\mathcal{H}^{\phi}(r,S_{r}^{n},X_{r}^{n},Y_{r}^{n},\nu)-(X_{r}^{n}-Y^{n}_{r})f(r)\big{)}dr\bigl{]}
and hence, dividing by δn\delta_{n},
0εnδn+𝔼[1δntnθn(tϕ(r,Srn,Xrn,Yrn)ϕ(r,Srn,Xrn,Yrn,ν)(XrnYrn)f(r))dr].0\leq\frac{\varepsilon_{n}}{\delta_{n}}+\mathbb{E}\bigl{[}\frac{1}{\delta_{n}}\int_{t_{n}}^{\theta_{n}}\big{(}-\partial_{t}\phi(r,S_{r}^{n},X_{r}^{n},Y_{r}^{n})-\mathcal{H}^{\phi}(r,S_{r}^{n},X_{r}^{n},Y_{r}^{n},\nu)-(X_{r}^{n}-Y^{n}_{r})f(r)\big{)}dr\bigl{]}.

Since the function ϕ\phi is assumed to be 𝒞1,2,2,2\mathcal{C}^{1,2,2,2}, ϕ\phi and its derivatives are locally bounded around n\mathcal{B}_{n}, uniformly in nn for t<Tt<T. Taking the limit as nn goes to \infty, and using that εn/δn0\varepsilon_{n}/\delta_{n}\to 0, we obtain

tϕ(t^,s^,x^,y^)ϕ(t^,s^,x^,y^,ν)(x^y^)f(t^)0, for any νm××m××.-\partial_{t}\phi(\hat{t},\hat{s},\hat{x},\hat{y})-\mathcal{H}^{\phi}(\hat{t},\hat{s},\hat{x},\hat{y},\nu)-(\hat{x}-\hat{y})f(\hat{t})\geq 0,\text{ for any }\nu\in\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R}.

We deduce that VV is a viscosity super-solution in the sense of Definition 4 to (bHJB).

Step 2. Proof of the sub-solution property. Let ϕ\phi be a 𝒞1,2,2,2\mathcal{C}^{1,2,2,2} function. Let (t^,s^,x^,y^)[0,T)×m××(\hat{t},\hat{s},\hat{x},\hat{y})\in[0,T)\times\mathbb{R}^{m}\times\mathbb{R}\times\mathbb{R} be such that

0=max(Vϕ)(t,s,x,y)=(Vϕ)(t^,s^,x^,y^).0=\max(V^{*}-\phi)(t,s,x,y)=(V^{*}-\phi)(\hat{t},\hat{s},\hat{x},\hat{y}).

We assume by contradiction that

tϕ(t^,s^,x^,y^)supνϕ(t^,s^,x^,y^,ν)(x^y^)f(t^)>0-\partial_{t}\phi(\hat{t},\hat{s},\hat{x},\hat{y})-\sup_{\nu}\mathcal{H}^{\phi}(\hat{t},\hat{s},\hat{x},\hat{y},\nu)-(\hat{x}-\hat{y})f(\hat{t})>0

Since the function ϕ\mathcal{H}^{\phi} is continuous, we deduce that there exists a ball ε\mathcal{B}_{\varepsilon} of radius ε>0\varepsilon>0 small enough around (t^,s^,x^,y^)(\hat{t},\hat{s},\hat{x},\hat{y}) such that

tϕ(t,s,x,y)supνϕ(t,s,x,y,ν)(xy)f(t)>0,(t,s,x,y)ε.-\partial_{t}\phi(t,s,x,y)-\sup_{\nu}\mathcal{H}^{\phi}(t,s,x,y,\nu)-(x-y)f(t)>0,\quad(t,s,x,y)\in\mathcal{B}_{\varepsilon}.

Note that there exists η>0\eta>0 independent of ν\nu such that

2η:=max(t,s,x,y)εV(t,s,x,y)ϕ(t,s,x,y).-2\eta:=\max_{(t,s,x,y)\in\partial\mathcal{B}_{\varepsilon}}V^{*}(t,s,x,y)-\phi(t,s,x,y).

Let (tn,sn,xn,yn)ε(t_{n},s_{n},x_{n},y_{n})\in\mathcal{B}_{\varepsilon} converging to (t^,s^,x^,y^)(\hat{t},\hat{s},\hat{x},\hat{y}) such that limV(tn,sn,xn,yn)=V(t^,s^,x^,y^)\lim\limits V(t_{n},s_{n},x_{n},y_{n})=V^{*}(\hat{t},\hat{s},\hat{x},\hat{y}) and η<(Vϕ)(tn,sn,xn,yn)<0-\eta<(V-\phi)(t_{n},s_{n},x_{n},y_{n})<0 for any n1n\geq 1. We define

θn=inf{t>tn,(t,Stn,Xtn,Y^tn,λt)ε×[0,n]}.\theta_{n}=\inf\{t>t_{n},\;(t,S^{n}_{t},X^{n}_{t},\hat{Y}^{n}_{t},\lambda_{t})\notin\mathcal{B}_{\varepsilon}\times[0,n]\}.

Applying Ito’s formula, we get for any control ν\nu

V(tn,sn,xn,yn)\displaystyle V(t_{n},s_{n},x_{n},y_{n}) η+ϕ(tn,sn,xn,yn)\displaystyle\geq-\eta+\phi(t_{n},s_{n},x_{n},y_{n})
=η+𝔼[ϕ(θn,Sθnn,Xθnn,Yθnn)tnθn[tϕ(r,Srn,Xrn,Yrn)+ϕ(r,Srn,Xrn,Yrn,νr)]𝑑r]\displaystyle=-\eta+\mathbb{E}\big{[}\phi(\theta_{n},S_{\theta^{n}}^{n},X_{\theta_{n}}^{n},Y_{\theta^{n}}^{n})-\int_{t_{n}}^{\theta_{n}}[\partial_{t}\phi(r,S_{r}^{n},X_{r}^{n},Y_{r}^{n})+\mathcal{H}^{\phi}(r,S_{r}^{n},X_{r}^{n},Y_{r}^{n},\nu_{r})]dr\big{]}
η+𝔼[ϕ(θn,Sθnn,Xθnn,Yθnn)+tnθn(XrnYrn)f(r)𝑑r]\displaystyle\geq-\eta+\mathbb{E}\big{[}\phi(\theta_{n},S_{\theta^{n}}^{n},X_{\theta_{n}}^{n},Y_{\theta^{n}}^{n})+\int_{t_{n}}^{\theta_{n}}(X_{r}^{n}-Y_{r}^{n})f(r)dr\big{]}
η+𝔼[V(θn,Sθnn,Xθnn,Yθnn)+tnθn(XrnYrn)f(r)𝑑r].\displaystyle\geq\eta+\mathbb{E}\big{[}V^{*}(\theta_{n},S_{\theta^{n}}^{n},X_{\theta_{n}}^{n},Y_{\theta^{n}}^{n})+\int_{t_{n}}^{\theta_{n}}(X_{r}^{n}-Y_{r}^{n})f(r)dr\big{]}.

Since η\eta is independent of the control ν\nu, this contradicts the dynamic programming principle (5). We thus deduce that

tϕ(t^,s^,x^,y^)supνϕ(t^,s^,x^,y^,ν)(x^y^)f(t^)0,-\partial_{t}\phi(\hat{t},\hat{s},\hat{x},\hat{y})-\sup_{\nu}\mathcal{H}^{\phi}(\hat{t},\hat{s},\hat{x},\hat{y},\nu)-(\hat{x}-\hat{y})f(\hat{t})\leq 0,

hence VV is a sub-solution in the sense of Definition 4 to (bHJB). ∎

3.2.2 Unbounded default time

As in the bounded case, we start by removing the constant Y^0\hat{Y}_{0} appearing in (4), as it plays no role in the optimization: Y^\hat{Y} and YY differ only by a constant, so they share the same governing SDE. Therefore, our starting point is the following optimization problem

sup(Z,ZX,Γ,ΓX,U)𝔼[UP(XTτπY^Tτ)]\sup_{(Z,Z^{X},\Gamma,\Gamma^{X},U)}\mathbb{E}[U^{P}(X_{T\wedge\tau}^{\pi^{*}}-\hat{Y}_{T\wedge\tau})]\\

The first difference is that, when reformulating the problem using the default density, we now retain a terminal part (1Fτ(T))(XTY^T)(1-F_{\tau}(T))(X_{T}-\hat{Y}_{T}). Overall, our problem is now

𝔼[0T(XtπY^t)f(t)dt+(1Fτ(T))(XTY^T)]\mathbb{E}\bigl{[}\int_{0}^{T}(X_{t}^{\pi^{*}}-\hat{Y}_{t})f(t)dt+(1-F_{\tau}(T))(X_{T}-\hat{Y}_{T})\bigl{]}
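Indeed, splitting the expectation according to whether the default occurs before or after the horizon, and using the independence of the default time (with density f and distribution function F_τ) together with Fubini's theorem, one gets the sketch

```latex
\mathbb{E}\bigl[X_{T\wedge\tau}^{\pi^{*}}-\hat{Y}_{T\wedge\tau}\bigr]
  =\mathbb{E}\bigl[(X_{\tau}^{\pi^{*}}-\hat{Y}_{\tau})\mathbf{1}_{\tau\leq T}\bigr]
  +\mathbb{E}\bigl[(X_{T}^{\pi^{*}}-\hat{Y}_{T})\mathbf{1}_{\tau>T}\bigr]
  =\mathbb{E}\Bigl[\int_{0}^{T}(X_{t}^{\pi^{*}}-\hat{Y}_{t})f(t)\,dt\Bigr]
  +(1-F_{\tau}(T))\,\mathbb{E}\bigl[X_{T}^{\pi^{*}}-\hat{Y}_{T}\bigr].
```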

Note that this problem differs from the bounded default time case only in the terminal condition, given by

v(T,x,y)=(1Fτ(T))(xy)v(T,x,y)=(1-F_{\tau}(T))(x-y)

We introduce the following integro-partial PDE, from which the verification theorem follows.

(uHJB){tv(t,s,x,y)+(xy)f(t)+sup(zx,z,gx,g,u){v(t,s,x,y,v,Δv,zx,z,gx,g,u)}=0,t<Tv(T,s,x,y)=(1Fτ(T))(xy).\textbf{(uHJB)}\quad\begin{cases}\partial_{t}v(t,s,x,y)+(x-y)f(t)+\sup_{(z^{x},z,g^{x},g,u)}\bigl{\{}\mathcal{H}^{v}(t,s,x,y,\nabla v,\Delta v,z^{x},z,g^{x},g,u)\bigl{\}}=0,\;t<T\\ v(T,s,x,y)=(1-F_{\tau}(T))(x-y).\end{cases}
Theorem 4 (Verification Theorem - unbounded case).

Assume that there exists a function ϕ\phi, twice continuously differentiable in space and differentiable in time, such that ϕ(t,s,x,y)\phi(t,s,x,y) solves (uHJB). We denote by (Z^t,Z^tX,Γ^t,Γ^tX,U^t)(\hat{Z}_{t},\hat{Z}_{t}^{X},\hat{\Gamma}_{t},\hat{\Gamma}_{t}^{X},\hat{U}_{t}) the optimizers in the supremum. Furthermore, assume that ϕ\phi has quadratic growth in yy and polynomial growth in s,xs,x, in the sense that

|ϕ(t,s,x,y)|κ(1+|x|p+sp+|y|2),p>1,κ>0.|\phi(t,s,x,y)|\leq\kappa(1+|x|^{p}+\|s\|^{p}+|y|^{2}),\;p>1,\;\kappa>0.

Then, for each t[0,T]t\in[0,T], the strategy (Z^X,Z^,Γ^X,Γ^,U^)(\hat{Z}^{X},\hat{Z},\hat{\Gamma}^{X},\hat{\Gamma},\hat{U}) is an optimal strategy for the control problem and

ϕ(0,S0,x,0)=V0=sup(ZX,Z,ΓX,Γ,U)𝔼[XTτπY^Tτ]Y^0.\displaystyle\phi(0,S_{0},x,0)=V_{0}=\sup_{(Z^{X},Z,\Gamma^{X},\Gamma,U)}\mathbb{E}\left[X_{T\wedge\tau}^{\pi^{*}}-\hat{Y}_{T\wedge\tau}\right]-\hat{Y}_{0}.

The optimal contract is given by

ξ\displaystyle\xi^{\star} =Y^0+0Tτi=1mZ^tiStidSti+0TτZ^tX𝑑Xt+0TτU^s𝑑Hs\displaystyle=\hat{Y}_{0}+\int_{0}^{T\wedge\tau}\sum_{i=1}^{m}\frac{\hat{Z}^{i}_{t}}{S^{i}_{t}}dS^{i}_{t}+\int_{0}^{T\wedge\tau}\hat{Z}_{t}^{X}dX_{t}+\int_{0}^{T\wedge\tau}\hat{U}_{s}dH_{s}
+120Tτ(Γ^tX+η(Z^tX)2)dX,Xt0TτF(t,Z^t,Z^tX,Γ^t,Γ^tX,πt)𝑑t\displaystyle+\frac{1}{2}\int_{0}^{T\wedge\tau}(\hat{\Gamma}_{t}^{X}+\eta(\hat{Z}_{t}^{X})^{2})d\langle X,X\rangle_{t}-\int_{0}^{T\wedge\tau}F(t,\hat{Z}_{t},\hat{Z}_{t}^{X},\hat{\Gamma}_{t},\hat{\Gamma}_{t}^{X},\pi^{*}_{t})dt
+0Tτi=1mΓ^siStidSi,Xt\displaystyle+\int_{0}^{T\wedge\tau}\sum_{i=1}^{m}\frac{\hat{\Gamma}^{i}_{s}}{S^{i}_{t}}d\langle S^{i},X\rangle_{t}
Remark 8.

The proof of this theorem follows the same lines as the proof of Theorem 2, without requiring localization of the λ\lambda term.

Remark 9.

Similarly to Theorem 3, the value function of the problem is a viscosity solution to (uHJB).

4 Numerical solutions

In this section, we explore the numerical analysis of the HJB equations (bHJB) and (uHJB). At a first level, the aim is to understand the differences between the bounded and unbounded default cases and to check them against the no-default case. At a second level, we are interested, for the bounded default, in capturing the following features:

  • understand how the default time affects the trading strategy and the incentive compensation scheme;

  • investigate how the skewness of the default with bounded support in [0,T][0,T] impacts the compensation scheme;

  • compare the compensation scheme, especially the compensation with respect to the default given by the process UU, between the linear case Ξl\Xi^{l} and the general case Ξ\Xi;

  • give a qualitative explanation of the average patterns of the various incentives related to the different kinds of risk involved in the compensation scheme.

The problem poses several numerical challenges, notably the absence of an explicit formula for the optimizer U^\hat{U} in Theorem 2 and the high dimensionality of the state variables. Both (bHJB) and (uHJB) require an iterative approach, as one of the coefficients involves the value of the solution at a point different from the one currently being evaluated. In the recent literature, neural networks have shown great potential as state-of-the-art numerical schemes for PDEs, so the backbone of our algorithm is a simple feed-forward neural net, deployed in a deep-learning scheme similar to the one used in [3], where an actor-critic approach finds the optimal policy for market making.
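As a minimal illustration of such a backbone, a plain feed-forward network can be sketched in numpy. This is a hypothetical stand-in (all names are ours): in practice an auto-differentiable framework is needed to form the PDE residual, but the forward structure is the same.

```python
import numpy as np

def init_mlp(dims, seed=0):
    """Xavier-initialised weights for a tanh feed-forward network."""
    rng = np.random.default_rng(seed)
    params = []
    for fan_in, fan_out in zip(dims[:-1], dims[1:]):
        scale = np.sqrt(2.0 / (fan_in + fan_out))
        params.append((rng.normal(0.0, scale, (fan_in, fan_out)),
                       np.zeros(fan_out)))
    return params

def mlp_forward(params, inputs):
    """Forward pass: tanh hidden layers, linear output layer."""
    h = inputs
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b

# Inputs (t, s, x, y) -> scalar value v; 8 hidden layers of 32 neurons,
# matching the architecture used in the experiments below.
params = init_mlp([4] + [32] * 8 + [1])
batch = np.random.default_rng(1).uniform(0.0, 1.0, (5000, 4))
values = mlp_forward(params, batch)   # shape (5000, 1)
```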

We recall that the PDE we are trying to solve is

tv(t,s,x,y)+(xy)f(t)+sup(zx,z,gx,g,u){v(t,s,x,y,v,Δv,zx,z,gx,g,u)}=0,t<T\partial_{t}v(t,s,x,y)+(x-y)f(t)+\sup_{(z^{x},z,g^{x},g,u)}\bigl{\{}\mathcal{H}^{v}(t,s,x,y,\nabla v,\Delta v,z^{x},z,g^{x},g,u)\bigl{\}}=0,\;t<T

coupled with boundary and terminal conditions depending on which case we are considering. The idea of our algorithm is to start from an initial guess for the optimizers

(z0x,z0,g0x,g0,u0)({z_{0}^{x}}^{*},z^{*}_{0},{g_{0}^{x}}^{*},g^{*}_{0},u^{*}_{0})

and an initial untrained neural network serving as the initial value function v0v_{0}. For this fixed set of values, we find the solution v1v_{1} of the above PDE by training the neural network. Training a so-called Physics-Informed Neural Network consists in using the auto-differentiation tools of deep-learning packages to compute the required derivatives, building a loss function from the residual of the PDE, and adding the boundary/terminal/initial conditions as additional loss terms. A very detailed description can be found in the seminal work [42]. The advantage of using a neural network is that v1v_{1} is a parametrized solution, so we can numerically solve again the optimisation problem

sup(zx,z,gx,g,u){v(t,s,x,y,v,Δv,zx,z,gx,g,u)}\sup_{(z^{x},z,g^{x},g,u)}\bigl{\{}\mathcal{H}^{v}(t,s,x,y,\nabla v,\Delta v,z^{x},z,g^{x},g,u)\bigl{\}}

finding new optimal variables denoted by (z1x,z1,g1x,g1,u1)({z_{1}^{x}}^{*},z^{*}_{1},{g_{1}^{x}}^{*},g^{*}_{1},u^{*}_{1}), and solving the new PDE associated with these new parameters, iterating until convergence to the solution of the PDE. The pseudo-code can be found in Algorithm 1 and Algorithm 2.

Data: v0v_{0}: Initial guess for the PDE solution
(z0x,z0,g0x,g0,u0)({z_{0}^{x}}^{*},z_{0}^{*},{g_{0}^{x}}^{*},g^{*}_{0},u^{*}_{0}): Initial maximization variables
Sampler: sampling of the variables
PDE inputs: parameters of the PDE
Tolerance ϵ\epsilon, max_iter: Maximum iterations
k0k\leftarrow 0: Iteration counter
Result: PDE solution vv^{*}
while kk\leq max_iter and not converged do
       /* Sample new points */
       sks_{k}\leftarrow sampler()();
      
      /* Solve the PDE with current guess */
       vk+1v_{k+1}\leftarrow SolvePDENN(sk,zkx,zk,gkx,gk,uk,vk)(s_{k},{z_{k}^{x}}^{*},z^{*}_{k},{g_{k}^{x}}^{*},g^{*}_{k},u^{*}_{k},v_{k});
      
      /* Maximize over variables */
       zk+1x,zk+1,gk+1x,gk+1,uk+1{z_{k+1}^{x}}^{*},z^{*}_{k+1},{g_{k+1}^{x}}^{*},g^{*}_{k+1},u^{*}_{k+1}\leftarrow SolveMax(vk+1)(v_{k+1});
      
      /* Check for convergence */
       if |vk+1vk|ϵ|v_{k+1}-v_{k}|\leq\epsilon then
             /* If converged, exit loop */
             break;
            
      
      /* Increment iteration counter */
       kk+1k\leftarrow k+1;
      
Algorithm 1 PDE-Maximizer Structure
Function SolvePDENN(sk,zkx,zk,gkx,gk,uk,vks_{k},{z_{k}^{x}}^{*},z^{*}_{k},{g_{k}^{x}}^{*},g^{*}_{k},u^{*}_{k},v_{k}):
       /* Step 1: Evaluate the PDE residual at the sampled points sks_{k} using current variables, and evaluate boundary and terminal loss */
       Compute PDE residual at sks_{k} using vk,zkx,zk,gkx,gk,ukv_{k},{z_{k}^{x}}^{*},z^{*}_{k},{g_{k}^{x}}^{*},g^{*}_{k},u^{*}_{k};
      
      /* Step 2: One step of stochastic gradient descent for the neural network-based approximation of the PDE */
       Use a neural network or numerical solver to minimize the residual and update vk+1v_{k+1};
      
      /* Step 3: Return the updated PDE solution */
       return vk+1v_{k+1};
      
Function SolveMax(vk+1v_{k+1}):
       /* Step 1: Compute the current approximation of the value function vk+1v_{k+1} */
       Plug in vk+1v_{k+1} into the function to be maximised;
      
      /* Step 2: Maximize over the variables */
       Solve for zk+1x,zk+1,gk+1x,gk+1,uk+1{z_{k+1}^{x}}^{*},z^{*}_{k+1},{g_{k+1}^{x}}^{*},g^{*}_{k+1},u^{*}_{k+1} that maximize the objective function (grid approach);
      
      /* Step 3: Return the maximized variables */
       return zk+1x,zk+1,gk+1x,gk+1,uk+1{z_{k+1}^{x}}^{*},z^{*}_{k+1},{g_{k+1}^{x}}^{*},g^{*}_{k+1},u^{*}_{k+1};
      
Algorithm 2 Subroutines: SolvePDENN and SolveMax
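To illustrate the alternation between SolvePDENN and SolveMax, the sketch below applies the same structure to a toy one-dimensional problem with a closed-form solution. This is purely illustrative: backward Euler on a grid plays the role of the neural PDE step, and the maximiser is recovered by pointwise grid search.

```python
import numpy as np

# Toy stand-in for the PDE/maximiser alternation:
#   v'(t) + sup_u { u v(t) - u^2/2 } = 0,  v(1) = 1,
# whose closed-form solution is v(t) = 2 / (t + 1).
n_steps = 1000
t = np.linspace(0.0, 1.0, n_steps + 1)
dt = t[1] - t[0]
u_grid = np.linspace(0.0, 3.0, 301)      # search grid for the maximiser

v = np.ones(n_steps + 1)                 # initial guess v_0
for _ in range(20):
    # "SolveMax": pointwise grid search of u -> u v - u^2 / 2
    hamiltonian = np.outer(v, u_grid) - 0.5 * u_grid**2
    u_star = u_grid[np.argmax(hamiltonian, axis=1)]
    # "SolvePDENN" stand-in: backward Euler with the maximiser frozen
    new_v = np.empty_like(v)
    new_v[-1] = 1.0                      # terminal condition
    for i in range(n_steps - 1, -1, -1):
        drift = u_star[i] * new_v[i + 1] - 0.5 * u_star[i] ** 2
        new_v[i] = new_v[i + 1] + dt * drift
    v = new_v

# v[0] is close to the exact value v(0) = 2/(0+1) = 2
```

The alternation is a policy-iteration step: freezing the maximiser makes the equation linear in v, and freezing v makes the maximisation a static problem, which is exactly the structure of Algorithms 1 and 2.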
Main numerical challenges and remedies.

Several numerical challenges arise from the training phase and the solution of (bHJB) or (uHJB). First, the neural network can collapse to a constant solution, since the residual can then be kept at a low value while the boundary/terminal conditions are met. To avoid the solution getting stuck at a constant, we implemented a regularisation term penalising first derivatives that are too close to 0. This term is progressively cancelled during training once sufficient training has been done. The second challenge is to select the weights of the loss terms attached to the boundary conditions. We use scaling weights for the terminal/boundary loss at the beginning of training so that the loss focuses on the PDE residual; these weights go back to 11 once sufficient training has been done. Finally, note that the convergence of this algorithm is difficult to study, especially in the first few iterations, where the loss can potentially explode due to the exponential term and the degenerate case with exploding term ΛT=\Lambda_{T}=\infty involved in (bHJB). To tackle the iterative nature of the problem, the optimizers were initialised uniformly on large intervals so that movements were not biased in any way. Moreover, instead of taking the current maximum, a weighted average was used to mollify the optimization process, with a weight schedule such that, after substantial training, the new optimal solution is taken with weight 0.990.99.
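The loss-weight scheduling and the mollified optimizer update described above can be sketched as follows. The function names, ramp shapes, and warm-up length are illustrative assumptions, not the exact values used in our experiments.

```python
def loss_weights(step, warmup=2000):
    """Illustrative schedule: focus on the PDE residual early on, then
    bring the boundary/terminal weight back to 1 and switch off the
    anti-constant derivative penalty."""
    progress = min(step / warmup, 1.0)
    w_boundary = 0.1 + 0.9 * progress       # ramps 0.1 -> 1
    w_gradient_penalty = 1.0 - progress     # decays 1 -> 0
    return w_boundary, w_gradient_penalty

def mollified_update(current_opt, new_opt, step, warmup=2000):
    """Weighted average of the old and new maximisers: the weight on the
    new candidate grows towards a 0.99 cap instead of jumping straight
    to the current maximum."""
    w_new = 0.5 + 0.49 * min(step / warmup, 1.0)
    return (1.0 - w_new) * current_opt + w_new * new_opt
```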

In practice, we used a simple grid-search algorithm for maximising over the variables (u,zx,z,ΓX,Γ)(u,z^{x},z,\Gamma^{X},\Gamma); more efficient approaches could be tried. From our experience with the problem, a neural-network approach, turning the whole algorithm into a pure deep-learning actor-critic, was not best suited: the maximisation task is difficult to handle in early training and gave rise to non-smooth results that made the optimisation problem much harder. Our solution, coupled with an interpolation mechanism in the style of KK-nearest neighbours, worked fairly well in practice for our purposes. For the sake of simplicity, m=d=1m=d=1 in the numerical simulations. Note, however, that our algorithm extends to higher dimensions, and that the algorithm developed above is still required to solve the integro-partial PDE (bHJB) or (uHJB), due to the absence of an explicit formula for U^\hat{U}.
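A minimal version of this grid-search step can be sketched as follows, with a hypothetical concave stand-in for the Hamiltonian and two of the five variables; in the actual scheme the objective depends on the current network approximation of the value function.

```python
import numpy as np

def grid_argmax(objective, grids):
    """Evaluate `objective` on the Cartesian product of 1-D grids and
    return the maximising tuple. `objective` must act elementwise on
    the meshgrid arrays."""
    mesh = np.meshgrid(*grids, indexing="ij")
    values = objective(*mesh)
    idx = np.unravel_index(np.argmax(values), values.shape)
    return tuple(g[i] for g, i in zip(grids, idx))

# Hypothetical concave objective in (u, z^x), maximised at (1.0, 2.0)
toy_h = lambda u, zx: -(u - 1.0) ** 2 - (zx - 2.0) ** 2
u_grid = np.linspace(0.0, 3.0, 301)
zx_grid = np.linspace(0.0, 4.0, 401)
u_star, zx_star = grid_argmax(toy_h, [u_grid, zx_grid])
# u_star close to 1.0, zx_star close to 2.0
```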

4.1 Numerical simulations with bounded default times

In this subsection we discuss the experiments with a bounded default time. In particular, we deployed a simple feed-forward NN with 3d inputs, 1d outputs, and 8 hidden layers of 32 neurons each. The learning rate was piece-wise linear, and we sampled 50005000 interior points uniformly and 400400 points split between terminal and boundary points. In order to capture the effect of skewness, we chose default times from the Beta family, with varying parameters corresponding to different skewness levels: a Beta(2,4)(2,4), a Beta(1,1)(1,1) (i.e. a uniform distribution), and a Beta(4,2)(4,2). Of course, the exploding compensator λ\lambda was computed but capped: since the interval of our study was (0,10)(0,10) in terms of wealth, we opted to cap the compensator process at 1000010000. The market has a positive drift of 10%10\% with a volatility of 30%30\%. As an example, we show the calculations for the symmetric Beta random variable, which is in fact a uniform distribution. Consider for example τ\tau uniformly distributed on [0,1][0,1]. We recall that

(τ>x|t)=x+γ(t,u)𝑑u\mathbb{P}(\tau>x|\mathcal{F}_{t})=\int_{x}^{+\infty}\gamma(t,u)du

Since we assume independence of τ\tau from the filtration, we have

(τ>x|t)=(τ>x)=1x\mathbb{P}(\tau>x|\mathcal{F}_{t})=\mathbb{P}(\tau>x)=1-x

as τU(0,1)\tau\sim U(0,1). Consequently,

γ(t,u)=𝟏1u,λt=γ(t,t)(τ>t|t)=11t.\gamma(t,u)=\mathbf{1}_{1\geq u},\;\lambda_{t}=\frac{\gamma(t,t)}{\mathbb{P}(\tau>t|\mathcal{F}_{t})}=\frac{1}{1-t}.

Using [18, Proposition 4.4], by defining

Λt=0tλs𝑑s\Lambda_{t}=\int_{0}^{t}\lambda_{s}ds

we have

(τ>t|t)=exp(Λt)\mathbb{P}(\tau>t|\mathcal{F}_{t})=\exp{(-\Lambda_{t})}
Λt=0tλs𝑑s=0t11s𝑑s=ln(1t)\displaystyle\Lambda_{t}=\int_{0}^{t}\lambda_{s}ds=\int_{0}^{t}\frac{1}{1-s}ds=-\ln{(1-t)}

so that

exp(Λt)=exp(ln(1t))=1t,\exp{(-\Lambda_{t})}=\exp{(\ln{(1-t)})}=1-t,

matching the survival function of the uniform distribution. In all the experiments with an independent default time, τ\tau has a probability density function fτf_{\tau} and an associated cumulative distribution function FτF_{\tau}, so that

λt=fτ(t)1Fτ(t)\lambda_{t}=\frac{f_{\tau}(t)}{1-F_{\tau}(t)}
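As a sanity check, the uniform-default computations above can be verified numerically. The sketch below (with a hypothetical helper name, and the cap value 10000 used in our experiments) confirms that the cumulated hazard matches -ln(1-t) and that its exponential recovers the survival function 1-t.

```python
import math
import numpy as np

def uniform_hazard(t, cap=1e4):
    """lambda_t = f(t) / (1 - F(t)) = 1 / (1 - t) for tau ~ U(0, 1),
    capped near t = 1 as in our experiments."""
    if t >= 1.0:
        return cap
    return min(1.0 / (1.0 - t), cap)

t_max = 0.9
s = np.linspace(0.0, t_max, 100_001)
lam = np.array([uniform_hazard(si) for si in s])
# Trapezoidal rule for Lambda_t = int_0^t lambda_s ds
Lambda = float(np.sum(0.5 * (lam[1:] + lam[:-1]) * np.diff(s)))

# Closed form: Lambda_t = -ln(1 - t), so exp(-Lambda_t) = 1 - t;
# here Lambda is close to -ln(0.1) and exp(-Lambda) is close to 0.1
```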
(a) Wealth model
(b) Strategy model
Figure 1: Wealth and strategies for our models and the case without any default.

Figure 1 gives a comparative study of the average portfolio evolution and average optimal strategy for a uniform distribution on [0,1][0,1], Beta distributions with opposite skews on [0,1][0,1], and the case without any jump, τ=+\tau=+\infty. Since at the default time the trader exits a market with positive drift, the total terminal wealth in the no-default case is larger, and in general the three models in which the default happens earliest on average are the least profitable, as expected. Note that the wealth obtained with a Beta(4,2)(4,2) distribution is higher than with the uniform, which in turn is higher than with the Beta(2,4)(2,4) distribution. If the default is more likely to happen at the end, the trader is less at risk of default at the beginning and so benefits from better decisions and more time to invest in the assets. The impact of the default on the investment strategy is quite different and reflects interesting features. While at the beginning all 4 strategies mirror each other, knowing that no default will happen makes the no-default investor more careful, whereas in all 3 default cases the strategy flattens out at a certain value. The default makes the investor more aggressive in trading compared with the no-default case.

(a) Compensation model
(b) Model uu
(c) Model zxz_{x}
(d) Model gxg_{x}
Figure 2: Compensation and optimal incentives evolution in different models.

Figure 2 gives the evolution of the average total compensation YtY_{t} (top left), the average compensation with respect to the default time UU (top right), the average incentive with respect to the wealth of the portfolio ZXZ^{X} (bottom left), and the average compensation with respect to the variability (quadratic variation) of the portfolio value ΓX\Gamma^{X} (bottom right). We note that for a right skew of the default time distribution, the total compensation YtY_{t} is reduced. The average compensation also increases over time. Intuitively, the investor needs to pay the broker more to make the investments when she knows that the conditional probability of default is greater.

Regarding the incentive related to the default UU, the uniform default distributes the incentive uniformly. The left-skewed distribution Beta(2,4)(2,4) shows that this compensation increases with time, since the default is all the more likely to happen for small values of time. The right-skewed Beta(4,2)(4,2) shows an interesting bimodal pattern: at the beginning, both the principal and the agent know in advance that the default will happen near the end, and the first local maximum of the green curve can be seen as a preventive compensation for the future default, so that there is no “wealth shock” at the anticipated default. Then, the compensation stays low until the end approaches, when the default becomes more likely.

The incentive ZXZ^{X} related to the portfolio performance grows over time, responding to the growth of the portfolio wealth. Note that the compensation ZXZ^{X} for the uniform distribution has the least growth, while a skew on the left provides a higher compensation.

The compensation ΓX\Gamma^{X} with respect to the variability of the wealth is almost constant for the uniform distribution, while it first decreases then increases for a skewed distribution. Finally, it is worth mentioning that, for the plots of ΓX\Gamma^{X}, the values are on average such that the matrix QQ is positive definite, thus meeting the requirement for the optimal contract to be admissible and for the whole numerical scheme to be well-posed.

Figure 3 investigates the difference between the solution of the problem with contracts restricted to the set of linear contracts Ξl\Xi^{l} and with the set of general contracts Ξ\Xi. The optimality gap in portfolio performance is small but noticeable (top left), and the linear strategy is more conservative, shifted down after half the interval but closely mirroring the more general contract (top right). Since the linear contract is not optimal, the compensation hedging against the default is very different, growing in time while remaining small with respect to the other variables (bottom center).

(a) Linear wealth model
(b) Linear strategy model
(c) Average incentive
Figure 3: Wealth, strategy, and average incentive with respect to the default under a uniform default, comparing the more general contract with the linear one.

4.2 Numerical simulations with unbounded default times and comparisons

We turn to a comparative study of the trading strategies and incentives when the default is unbounded versus bounded on [0,1][0,1]. In this section, we opted for an exponential default time in the unbounded case, with parameter λ=2\lambda=2, and compared against the uniform distribution, whose mean is the same.

Figure 4 presents the portfolio evolution, optimal strategy, incentive with respect to the jump UU, and average incentive with respect to the portfolio value in the three typical cases: uniform default, exponential default, and no default. We see the effect of reaching the pre-planned investment horizon, as the strategy in the unbounded case reaches, on average, a larger terminal wealth. Note that the trading strategy in the exponential case displays an in-between behaviour, between the no-default and the uniform-default cases: it is similar to the uniform case at the beginning, but gets closer to the no-default case as the end of the horizon approaches. Intuitively, since there is no certainty that the default will occur, the strategy does not have to stay as aggressive as in the uniform case.

Furthermore, and perhaps the most interesting takeaway from this comparison, as evident from Figure 4, the incentive for the default changes quite dramatically from the uniform case: there is now uncertainty about whether the default will actually happen, so the incentive for this risk is not intrinsically present in the contract; instead, the principal has to reward the agent for entering the market under the threat of default. In general, therefore, the various incentives differ from the bounded case, highlighting the difference between the two settings and the impact that the certainty of a default has on the structure of the contract.

(a) Wealth (Exponential Default)
(b) Strategy (Exponential Default)
(c) Average Incentive (Exponential Default)
(d) Portfolio Evolution (Exponential Default)
Figure 4: Comparison of no default, uniform default, and exponential default for wealth, trading strategy, average incentive with respect to the default, and portfolio evolution.

References

  • [1] Anna Aksamit, Monique Jeanblanc, et al. Enlargement of filtration with finance in view. 2017.
  • [2] Robert Almgren and Neil Chriss. Optimal execution of portfolio transactions. Journal of Risk, 3:5–40, 2001.
  • [3] Bastien Baldacci, Iuliia Manziuk, Thibaut Mastrolia, and Mathieu Rosenbaum. Market making and incentives design in the presence of a dark pool: a deep reinforcement learning approach. arXiv preprint arXiv:1912.01129, 2019.
  • [4] Bastien Baldacci and Dylan Possamaï. Governmental incentives for green bonds investment. Mathematics and Financial Economics, 16(3):539–585, 2022.
  • [5] Guy Barles, Rainer Buckdahn, and Etienne Pardoux. Backward stochastic differential equations and integral-partial differential equations. Stochastics: An International Journal of Probability and Stochastic Processes, 60(1-2):57–83, 1997.
  • [6] Atilim Gunes Baydin, Barak A Pearlmutter, Alexey Andreyevich Radul, and Jeffrey Mark Siskind. Automatic differentiation in machine learning: a survey. Journal of machine learning research, 18(153):1–43, 2018.
  • [7] Peter L Bernstein and Aswath Damodaran. Investment management. J. Wiley, 1998.
  • [8] Tomasz R Bielecki and Marek Rutkowski. Credit risk: modeling, valuation and hedging. Springer Science & Business Media, 2013.
  • [9] Christophette Blanchet-Scalliet, Nicole El Karoui, Monique Jeanblanc, and Lionel Martellini. Optimal investment decisions when time-horizon is uncertain. Journal of Mathematical Economics, 44(11):1100–1113, 2008.
  • [10] Pierre Brémaud and Marc Yor. Changes of filtrations and of probability measures. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 45(4):269–295, 1978.
  • [11] Alessandro Chiusolo and Emma Hubert. A new approach to principal-agent problems with volatility control. arXiv preprint arXiv:2407.09471, 2024.
  • [12] Michael G Crandall, Hitoshi Ishii, and Pierre-Louis Lions. User’s guide to viscosity solutions of second order partial differential equations. Bulletin of the American mathematical society, 27(1):1–67, 1992.
  • [13] Michael G Crandall and Pierre-Louis Lions. Viscosity solutions of hamilton-jacobi equations. Transactions of the American mathematical society, 277(1):1–42, 1983.
  • [14] Jakša Cvitanić, Dylan Possamaï, and Nizar Touzi. Moral hazard in dynamic risk management. Management Science, 63(10):3328–3346, 2017.
  • [15] Jakša Cvitanić, Dylan Possamaï, and Nizar Touzi. Dynamic programming approach to principal–agent problems. Finance and Stochastics, 22:1–37, 2018.
  • [16] Flávia Zóboli Dalmácio, Valcemiro Nossa, et al. The agency theory applied to the investment funds. Brazilian Business Review, 1(1):31–44, 2004.
  • [17] Peter M DeMarzo and Yuliy Sannikov. Optimal security design and dynamic capital structure in a continuous-time agency model. The journal of Finance, 61(6):2681–2724, 2006.
  • [18] Nicole El Karoui, Monique Jeanblanc, and Ying Jiao. What happens after a default: the conditional density approach. Stochastic processes and their applications, 120(7):1011–1032, 2010.
  • [19] Nicole El Karoui, Shige Peng, and Marie Claire Quenez. Backward stochastic differential equations in finance. Mathematical finance, 7(1):1–71, 1997.
  • [20] Wendell H Fleming and Halil Mete Soner. Controlled Markov processes and viscosity solutions, volume 25. Springer Science & Business Media, 2006.
  • [21] Xin Guo and Yan Zeng. Intensity process and compensator: A new filtration expansion approach and the jeulin–yor theorem. The Annals of Applied Probability, 18(1), 2008.
  • [22] Bengt Holmstrom and Paul Milgrom. Aggregation and linearity in the provision of intertemporal incentives. Econometrica: Journal of the Econometric Society, pages 303–328, 1987.
  • [23] Ying Hu, Peter Imkeller, and Matthias Müller. Utility maximization in incomplete markets. The Annals of Applied Probability, pages 1691 – 1712, 2005.
  • [24] Nobuyuki Ikeda and Shinzo Watanabe. Stochastic differential equations and diffusion processes. Elsevier, 2014.
  • [25] Monique Jeanblanc, Thibaut Mastrolia, Dylan Possamaï, and Anthony Réveillac. Utility maximization with random horizon: a bsde approach. International Journal of Theoretical and Applied Finance, 18(07):1550045, 2015.
  • [26] Monique Jeanblanc and Anthony Réveillac. A note on bsdes with singular driver coefficients. In Arbitrage, credit and informational risks, pages 207–224. World Scientific, 2014.
  • [27] Monique Jeanblanc, Marc Yor, and Marc Chesney. Mathematical methods for financial markets. Springer Science & Business Media, 2009.
  • [28] Michael C Jensen. The performance of mutual funds in the period 1945-1964. The Journal of finance, 23(2):389–416, 1968.
  • [29] Idris Kharroubi, Thomas Lim, and Armand Ngoupeyou. Mean-variance hedging on uncertain time horizon in a market with a jump. Applied Mathematics & Optimization, 68:413–444, 2013.
  • [30] Andrei Kirilenko, Albert S Kyle, Mehrdad Samadi, and Tugkan Tuzun. The flash crash: High-frequency trading in an electronic market. The Journal of Finance, 72(3):967–998, 2017.
  • [31] Magdalena Kobylanski. Backward stochastic differential equations and partial differential equations with quadratic growth. The annals of probability, 28(2):558–602, 2000.
  • [32] Raymond CW Leung. Continuous-time principal-agent problem with drift and stochastic volatility control: with applications to delegated portfolio management. Available at SSRN, 2014.
  • [33] C Wei Li and Ashish Tiwari. Incentive contracts in delegated portfolio management. The Review of Financial Studies, 22(11):4681–4714, 2009.
  • [34] Harry M Markowitz. Foundations of portfolio theory. The journal of finance, 46(2):469–477, 1991.
  • [35] Marie-Amelie Morlais. Utility maximization in a jump market model. Stochastics: An International Journal of Probability and Stochastics Processes, 81(1):1–27, 2009.
  • [36] Hui Ou-Yang. Optimal contracts in a continuous-time delegated portfolio management problem. The Review of Financial Studies, 16(1):173–208, 2003.
  • [37] Antonis Papapantoleon, Dylan Possamaï, and Alexandros Saplaouras. Existence and uniqueness results for bsde with jumps: the whole nine yards. Electronic Journal of Probability, 23:1 – 68, 2018.
  • [38] Etienne Pardoux and Shige Peng. Adapted solution of a backward stochastic differential equation. Systems & control letters, 14(1):55–61, 1990.
  • [39] Etienne Pardoux and Shige Peng. Backward stochastic differential equations and quasilinear parabolic partial differential equations. In Stochastic Partial Differential Equations and Their Applications: Proceedings of IFIP WG 7/1 International Conference University of North Carolina at Charlotte, NC June 6–8, 1991, pages 200–217. Springer, 2005.
  • [40] Elisabeth Paté-Cornell. On “black swans” and “perfect storms”: Risk analysis and management when statistics are not enough. Risk Analysis: An International Journal, 32(11):1823–1833, 2012.
  • [41] Nicolas Privault. Introduction to stochastic finance with market examples. Chapman and Hall/CRC, 2022.
  • [42] Maziar Raissi, Paris Perdikaris, and George Em Karniadakis. Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations. arXiv preprint arXiv:1711.10561, 2017.
  • [43] Yuliy Sannikov. A continuous-time version of the principal-agent problem. The Review of Economic Studies, 75(3):957–984, 2008.
  • [44] Justin Sirignano and Konstantinos Spiliopoulos. Dgm: A deep learning algorithm for solving partial differential equations. Journal of computational physics, 375:1339–1364, 2018.
  • [45] Livio Stracca. Delegated portfolio management: A survey of the theoretical literature. Journal of Economic surveys, 20(5):823–848, 2006.
  • [46] Nizar Touzi. Optimal stochastic control, stochastic target problems, and backward SDE, volume 29. Springer Science & Business Media, 2012.