Trading under the Proof-of-Stake Protocol
– a Continuous-Time Control Approach

Wenpin Tang, Department of Industrial Engineering and Operations Research, Columbia University, and David D. Yao, Department of Industrial Engineering and Operations Research, Columbia University.
Abstract.

We develop a continuous-time control approach to optimal trading in a Proof-of-Stake (PoS) blockchain, formulated as a consumption-investment problem that aims to strike the optimal balance between a participant’s (or agent’s) utility from holding/trading stakes and utility from consumption. We present solutions via dynamic programming and the Hamilton-Jacobi-Bellman (HJB) equations. When the utility functions are linear or convex, we derive closed-form solutions and show that the bang-bang strategy is optimal (i.e., always buy or sell at full capacity). Furthermore, we bring out the explicit connection between the rate of return in trading/holding stakes and the participant’s risk-adjusted valuation of the stakes. In particular, we show that when a participant is risk-neutral or risk-seeking, corresponding to the risk-adjusted valuation being a martingale or a sub-martingale, the optimal strategy must be to either buy all the time, sell all the time, or first buy then sell, with both buying and selling executed at full capacity. We also propose a risk-control version of the consumption-investment problem; and for a special case, the “stake-parity” problem, we show that a mean-reverting strategy is optimal.

Key words: Consumption-investment, Proof of Stake (PoS) protocol, cryptocurrency, dynamic programming, HJB equations, continuous-time control, risk control.

1. Introduction

As a digital exchange vehicle, blockchain technology has been successfully deployed in many applications including cryptocurrency [18], healthcare [9], supply chain [8], electoral voting [27], and non-fungible tokens [26]. A blockchain is a growing chain of accounting records, called blocks, which are jointly maintained by participants of the system using cryptography. Consider for instance Bitcoin, a peer-to-peer decentralized payment system. In contrast to traditional payment processing networks, Bitcoin provides a permissionless environment in which everyone is free to participate. At the core of Bitcoin is the consensus protocol known as Proof of Work (PoW), in which “miners” compete with each other by solving a hashing puzzle so as to validate an ever-growing log of transactions (the “longest chain”) to update a distributed ledger; and the miner who solves the puzzle first receives a reward (a number of coins). Thus, while the competition is open to all participants, the chance of winning is proportional to a miner’s computing power.

Despite its popularity, the PoW protocol has some obvious drawbacks. Competition among miners has led to exploding levels of energy consumption in Bitcoin mining [17, 20]. It has also been pointed out in [1, 3, 7] that PoW mining will lead to centralization, violating the core tenet of decentralization. To solve the problem of energy efficiency, [14, 28] introduced another consensus protocol, Proof of Stake (PoS), which is a bidding mechanism to select a miner to validate the new block. Participants who choose to join the bidding process are required to commit certain stakes (coins they own), and the winning probability is proportional to the stakes committed. Hence, a participant in a PoS blockchain is a “bidder”, and only the winning bidder becomes the miner who does the validation. As yet the PoS protocol has not been as popular as PoW. However, it is catching up quickly, and blockchain developers have strong incentives to switch from a PoW to a PoS ecosystem. A prominent case in this direction is Ethereum 2.0, where two parallel chains, Mainnet (PoW) and Beacon Chain (PoS), are expected soon to merge into one unified PoS blockchain [10].

There has been an active stream of recent studies on PoS in the research literature; and here we briefly mention several that relate closely to our study. In [22] it is shown that the PoS protocol is “without waste” from an economic standpoint. Issues of stability and decentralization of the PoS protocol are examined in [21, 24]. Specifically, it is shown in [21] that for large owners of initial wealth in a PoS system their shares of the total wealth will remain stable in the long run (i.e., proportions to the total wealth will remain constant), and hence the rich-get-richer phenomenon will not happen. [24] further extends this to medium and small participants, and reveals a phase transition in share stability among those different types of participants. In [21, 25], various aspects of the consumption-investment problem in PoS are examined, and certain conditions are identified under which a participant may have no incentive to trade with others. This leads to the complementary question: given that a participant does prefer to trade, what is the optimal trading strategy?

Motivated by the above question, the objective of our study here is to develop a continuous-time control approach to optimal trading in a PoS blockchain. While the control (or game) approach has been proposed in previous studies [4, 5, 15], those studies all concern the PoW protocol. To the best of our knowledge, ours is the first control model developed for optimal trading under the PoS protocol.

Here is an overview of our main results. We first formulate the consumption-investment problem, which aims to strike a balance between a participant’s utility from holding/trading stakes and utility from consumption. It takes the form of a deterministic control problem with the real-time trading strategy being the control variable. We start with a detailed analysis on a special case that we call the “stake-hoarding” problem (Proposition 3.1), where we bring out the possible scenario of monopoly. We then solve the general consumption-investment problem via dynamic programming and the Hamilton-Jacobi-Bellman (HJB) equations (Theorem 3.4).

When the utility functions are linear or convex, more explicit solutions can be obtained, and we show that the bang-bang control is optimal, i.e., always buy or sell at full capacity (Propositions 4.1 and 4.3). Along with the optimal trading strategy, we are also able to bring out the explicit connection between the rate of return in trading/holding stakes and the participant’s risk-adjusted valuation of the stakes. In other words, the participant’s risk sensitivity is explicitly accounted for in the trading strategy. In particular, when a participant is risk-neutral or risk-seeking, corresponding to the risk-adjusted valuation being a martingale or a sub-martingale, the optimal strategy must be to either buy all the time, sell all the time, or first buy then sell (with both buying and selling executed at full capacity).

Finally, we propose a risk control version of the consumption-investment problem, by adding a penalty term to control the level of stake holding so as to reduce the level of concentration risk (Theorem 5.1). A special case is a “stake-parity” problem, where the participant’s holding is controlled at a level that tries to track the system-wide average. We show that the “mean-reverting” strategy is the optimal solution to the stake-parity problem (Proposition 5.2).

The rest of the paper is organized as follows. Section 2 details the formulation of the consumption-investment problem under the PoS protocol. Section 3 presents the optimal solution to the problem, and Section 4 focuses on the special case of linear and convex utility functions. Section 5 presents extensions to risk-control objectives. Concluding remarks are summarized in Section 6.

2. Model Formulation

This section introduces the problem of trading under the PoS protocol in continuous time, and formulates a control model to solve the problem. First, collected below are some conventions that will be used throughout this paper.

  • $\mathbb{R}$ denotes the set of real numbers, and $\mathbb{R}_{+}$ denotes the set of nonnegative real numbers.

  • For $x,y\in\mathbb{R}$, $x\wedge y$ denotes the smaller of $x$ and $y$; $x\vee y$ denotes the larger of $x$ and $y$.

  • The symbol $x=o(y)$ means $\frac{x}{y}$ decays towards zero as $y\to\infty$.

  • For a random variable $X$, $\mathbb{E}(X)$ denotes the expectation of $X$.

  • Let $\Omega$ be a subset of $\mathbb{R}$. A function $f\in\mathcal{C}^{k}(\Omega)$ if it is $k$-times continuously differentiable in $\Omega$.

  • For $f\in\mathcal{C}^{1}([0,T])$, $f^{\prime}(t)$ denotes the derivative of $f$. For $f\in\mathcal{C}^{1}([0,T]\times\Omega)$, $\partial_{t}f$ (resp. $\partial_{x}f$) denotes the partial derivative of $f$ with respect to $t$ (resp. $x$).

Time is continuous, indexed by $t\in[0,T]$, for a fixed $T>0$ representing the length of a finite horizon. Let $\{N(t),\,0\leq t\leq T\}$ (with $N(0):=N$) denote the process of the total volume of stakes, which are issued over time by the PoS protocol, and can either be deterministic or stochastic. For ease of presentation, we consider a deterministic process $N(t)$, which is increasing in time and sufficiently smooth, with the derivative $N^{\prime}(t)$ representing the instantaneous rate of “reward”: additional stakes (or “coins”) injected into the system, as specified (exogenously) by the PoS protocol. For instance, we will consider below, as a special case, the process $N(t)$ of a polynomial form:

$$N_{\alpha}(t)=(N^{\frac{1}{\alpha}}+t)^{\alpha},\qquad t\geq 0. \tag{2.1}$$

Then, $N^{\prime}_{\alpha}(t)=\alpha(N^{\frac{1}{\alpha}}+t)^{\alpha-1}$, and $N^{\prime\prime}_{\alpha}(t)=\alpha(\alpha-1)(N^{\frac{1}{\alpha}}+t)^{\alpha-2}$, so the parametric family (2.1) covers different rewarding schemes according to the values of $\alpha$.

  • For $0<\alpha<1$, we have $N^{\prime\prime}_{\alpha}(t)<0$, so the process $N_{\alpha}(t)$ corresponds to a decreasing reward (e.g. Bitcoin);

  • For $\alpha=1$, the process $N_{1}(t)=N+t$ gives a constant reward of rate one (e.g. Blackcoin);

  • For $\alpha>1$, we get $N^{\prime\prime}_{\alpha}(t)>0$ and hence, the process $N_{\alpha}(t)$ amounts to an increasing reward (e.g. EOS).
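To make the three regimes concrete, the following minimal Python sketch evaluates the polynomial family (2.1) and its reward rate. The function names `total_stakes`, `reward_rate` and `reward_trend`, as well as the sample values, are illustrative assumptions and not part of the model.

```python
def total_stakes(t, n0, alpha):
    """N_alpha(t) = (n0**(1/alpha) + t)**alpha, as in (2.1); n0 stands for N(0) = N."""
    return (n0 ** (1.0 / alpha) + t) ** alpha


def reward_rate(t, n0, alpha):
    """Instantaneous reward rate N'_alpha(t) = alpha * (n0**(1/alpha) + t)**(alpha - 1)."""
    return alpha * (n0 ** (1.0 / alpha) + t) ** (alpha - 1.0)


def reward_trend(alpha):
    """Classify the scheme by the sign of N''_alpha, i.e. by comparing alpha with 1."""
    if alpha < 1:
        return "decreasing"
    if alpha == 1:
        return "constant"
    return "increasing"


if __name__ == "__main__":
    # Hypothetical initial volume of 100 stakes, three values of alpha.
    for alpha in (0.5, 1.0, 2.0):
        rates = [reward_rate(t, 100.0, alpha) for t in (0.0, 10.0, 100.0)]
        print(alpha, reward_trend(alpha), rates)
```

In each case the printed rates should move in the direction reported by `reward_trend`, while `total_stakes(0, n0, alpha)` returns the initial volume for every `alpha`.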

Let $K\geq 2$ denote the total number of participants in the system, who are indexed by $k\in[K]:=\{1,\ldots,K\}$. For each participant $k$, let $\{X_{k}(t),\,0\leq t\leq T\}$ (with $X_{k}(0)=x_{k}$) denote the process of the number of stakes that participant $k$ holds, with $X_{k}(t)\geq 0$ and $\sum_{k=1}^{K}X_{k}(t)=N(t)$ for all $t\in[0,T]$. In the (discrete-time) PoS protocol, in each round of the bidding process, individual participants commit stakes so as to be selected to validate the block and receive a reward; and the winning probability is $X_{k}(t)/N(t)$ for participant $k$, i.e., proportional to the number of stakes committed. (For instance, each round in Ethereum takes about $10$ seconds, corresponding to the block-generation time [6].) For our continuous-time PoS model here, in which the time required for each round of voting is “infinitesimal,” imagine there are $M$ rounds of bidding during any given time interval $[t,t+\Delta t]$. In each round participant $k$ gets either some stake(s) or nothing; so the average total number of stakes $k$ will get over the $M$ rounds is (by the law of large numbers when $M$ is large),

$$\underbrace{\frac{X_{k}(t)}{N(t)}\frac{N^{\prime}(t)\Delta t}{M}}_{\text{average number of stakes in each round}}\times\underbrace{M}_{\text{number of rounds}}=\frac{X_{k}(t)}{N(t)}N^{\prime}(t)\Delta t.$$

Hence, replacing $\Delta t$ by the infinitesimal $dt$, we know participant $k$ will receive (on average) $\frac{X_{k}(t)}{N(t)}N^{\prime}(t)dt$ stakes, where $\frac{X_{k}(t)}{N(t)}$ is $k$’s winning probability, and $N^{\prime}(t)dt$ is the reward issued by the blockchain in $[t,t+dt]$.

Participants are allowed to trade (buy or sell) their stakes. Participant $k$ will buy $\nu_{k}(t)dt$ stakes in $[t,t+dt]$ if $\nu_{k}(t)>0$, and sell $-\nu_{k}(t)dt$ stakes if $\nu_{k}(t)<0$. This leads to the following dynamics of participant $k$’s stakes under trading:

$$X^{\prime}_{k}(t)=\nu_{k}(t)+\frac{N^{\prime}(t)}{N(t)}X_{k}(t)\quad\text{for }0\leq t\leq\tau_{k}\wedge T:=\mathcal{T}_{k}, \tag{2.2}$$

where $\tau_{k}:=\inf\{t>0:X_{k}(t)=0\}$ is the first time at which the process $X_{k}(t)$ reaches zero. It is reasonable to stop the trading process if a participant runs out of stakes, or gets all available stakes:

  • If $\mathcal{T}_{k}=\tau_{k}$, then participant $k$ liquidates all his stakes by time $\tau_{k}$, and $X_{k}(\mathcal{T}_{k})=0$;

  • If $\mathcal{T}_{k}=\max_{j\neq k}\tau_{j}$, then participant $k$ gets all issued stakes by time $\max_{j\neq k}\tau_{j}$, and hence $X_{k}(\mathcal{T}_{k})=N(\mathcal{T}_{k})$.

We set $X_{k}(t)=X_{k}(\mathcal{T}_{k})$ for $t>\mathcal{T}_{k}$.
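The stake dynamics (2.2), together with the stopping rule just described, can be explored numerically. The sketch below is only an illustration under stated assumptions: it uses an explicit Euler step (a discretization choice made here, not something prescribed by the model), the name `simulate_stakes` and its arguments are hypothetical, and the trading rate `nu` is supplied by the caller without any check of a capacity bound.

```python
def simulate_stakes(x0, nu, total, total_rate, horizon, steps=10_000):
    """Explicit-Euler sketch of X'(t) = nu(t) + N'(t)/N(t) * X(t), X(0) = x0.

    nu, total and total_rate are callables of time for the trading rate, N(t) and N'(t).
    The path is stopped as soon as it reaches 0 (stakes liquidated) or N(t)
    (all stakes acquired). Returns (stopping time, stakes held at that time).
    """
    dt = horizon / steps
    t, x = 0.0, float(x0)
    for _ in range(steps):
        x_next = x + dt * (nu(t) + total_rate(t) / total(t) * x)
        t_next = t + dt
        if x_next <= 0.0:
            return t_next, 0.0              # participant has sold everything
        if x_next >= total(t_next):
            return t_next, total(t_next)    # participant holds every issued stake
        t, x = t_next, x_next
    return horizon, x


if __name__ == "__main__":
    # Hypothetical setting: N(t) = 100 + t (constant reward), initial holding of 20 stakes.
    n = lambda t: 100.0 + t
    dn = lambda t: 1.0
    print(simulate_stakes(20.0, lambda t: 0.0, n, dn, 10.0))   # no trading
    print(simulate_stakes(1.0, lambda t: -1.0, n, dn, 10.0))   # selling at rate one
```

With no trading the share $X(t)/N(t)$ stays at its initial value, so the first call should return roughly 22 stakes at time 10; in the second call the holding is exhausted shortly after time 1.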

The problem is for each participant $k$ to decide how to trade stakes with others under the PoS protocol. Let $\{P(t),\,0\leq t\leq T\}$ be the price process of each (unit of) stake, which is a stochastic process assumed to be independent of the dynamics in (2.2). (This assumption has appeared in recent studies (e.g., [21]), and to some extent reflects the reality that the crypto price tends to be affected by market shocks such as macroeconomics, geopolitics, breaking news, etc., much more than by trading activities.) Here, the price $P(t)$ of each stake is measured in terms of an underlying risk-free asset (referred to as “cash” for simplicity); let $b_{k}(t)$ denote the (units of) risk-free asset that participant $k$ holds at time $t$, and let $r>0$ denote the risk-free (interest) rate. Also note that all $K$ participants are allowed to trade stakes (with cash) only internally among themselves, whereas each participant can only exchange cash with an external source (say, a bank).

The decision for each participant $k$ at $t$ is hence a tuple $(\nu_{k}(t),b_{k}(t))$. Let $\{c_{k}(t),\,0\leq t\leq T\}$ be the process of consumption, or cash flow of participant $k$, which follows the dynamics below:

$$dc_{k}(t)=rb_{k}(t)dt-db_{k}(t)-P(t)\nu_{k}(t)dt,\qquad 0\leq t\leq\mathcal{T}_{k}; \tag{C1}$$

with

$$b_{k}(0)=0,\quad b_{k}(t)\geq 0\text{ for }0\leq t\leq\mathcal{T}_{k},\quad 0\leq X_{k}(t)\leq N(t)\text{ for }0\leq t\leq\mathcal{T}_{k}. \tag{C2}$$

Set $b_{k}(t)=b_{k}(\mathcal{T}_{k})$ and $\nu_{k}(t)=0$ for $t>\mathcal{T}_{k}$.

In (C1), if $db_{k}(t)<0$, the participant sells the risk-free asset to get cash either for buying stakes, or for consumption; if $db_{k}(t)>0$, the participant adds more risk-free asset. Thus, (C1) is a self-financing condition in which $rb_{k}(t)dt-db_{k}(t)$ is the net change (in value of the risk-free asset held) used to finance new stakes $P(t)\nu_{k}(t)dt$ and consumption $dc_{k}(t)$. The requirements in (C2) are all in the spirit of disallowing shorting on either the risk-free asset $b_{k}(t)$ or the stakes $X_{k}(t)$. In some PoS blockchains, there is a minimum requirement for bidding (e.g. 32 ETH for Ethereum). In this case, we can impose a lower bound on the process $X_{k}(t)$, to prevent it from falling below this threshold. The analysis will be similar. We also require that the trading strategy be bounded: there is $\overline{\nu}_{k}>0$ such that

$$|\nu_{k}(t)|\leq\overline{\nu}_{k}. \tag{C3}$$

The objective of participant $k$ is:

\begin{align}
\sup_{\{(\nu_{k}(t),b_{k}(t))\}}\;& J(\nu_{k},b_{k}):=\mathbb{E}\left\{\int_{0}^{\mathcal{T}_{k}}e^{-\beta_{k}t}\left[dc_{k}(t)+\ell_{k}(X_{k}(t))dt\right]+e^{-\beta_{k}\mathcal{T}_{k}}\left[b_{k}(\mathcal{T}_{k})+h_{k}(X_{k}(\mathcal{T}_{k}))\right]\right\} \tag{2.3}\\
& \text{subject to } (2.2),\ (\text{C}1),\ (\text{C}2),\ (\text{C}3), \notag
\end{align}

where $\beta_{k}>0$ is a discount factor, a parameter measuring the risk sensitivity of participant $k$; $\ell_{k}(\cdot)$ and $h_{k}(\cdot)$ are two utility functions representing, respectively, the running profit and the terminal profit.

While generally following Merton’s consumption-investment framework, our formulation as presented above takes into account some distinct features of PoS blockchains and cryptocurrencies. One notable point is that the utilities $\ell$ and $h$ in the objective are expressed as functions of the number of stakes $X_{k}(t)$, as opposed to their total value $P(t)X_{k}(t)$. To the extent that $P(t)$ is treated as exogenous (as explained above), this difference may seem to be trivial. Yet, it is a reflection of the more substantial fact that crypto-participants tend to mentally decouple the utility of holding stakes from their monetary value at any given time. For instance, holding $1$ ETH may be equivalent to \$5,000 for one person, and \$500 for another, and neither will be influenced by the ETH market price at the time, which could be, say, about \$1,500.

Throughout below, the following conditions will be assumed:

Assumption 2.1.
  (i) $N:[0,T]\to\mathbb{R}_{+}$ is increasing with $N(0)=N>0$, and $N\in\mathcal{C}^{2}([0,T])$.

  (ii) $\ell:\mathbb{R}_{+}\to\mathbb{R}_{+}$ is increasing and $\ell\in\mathcal{C}^{1}(\mathbb{R}_{+})$.

  (iii) $h:\mathbb{R}_{+}\to\mathbb{R}_{+}$ is increasing and $h\in\mathcal{C}^{1}(\mathbb{R}_{+})$.

3. The Consumption-Investment Problem

Here we study the consumption-investment problem for participant $k$ in (2.3). To lighten notation, omit the subscript $k$, and write out the problem in full as follows, where (C0) is a repeat of the state dynamics in (2.2):

\begin{align}
U(x):=\sup_{\{(\nu(t),b(t))\}}\;& J(\nu,b):=\mathbb{E}\left\{\int_{0}^{\mathcal{T}}e^{-\beta t}\left[dc(t)+\ell(X(t))dt\right]+e^{-\beta\mathcal{T}}\left[b(\mathcal{T})+h(X(\mathcal{T}))\right]\right\} \tag{3.1}\\
\text{subject to }\;& X^{\prime}(t)=\nu(t)+\frac{N^{\prime}(t)}{N(t)}X(t),\quad X(0)=x, \tag{C0}\\
& dc(t)=rb(t)dt-db(t)-P(t)\nu(t)dt, \tag{C1}\\
& b(0)=0,\quad b(t)\geq 0\text{ and }0\leq X(t)\leq N(t), \tag{C2}\\
& |\nu(t)|\leq\overline{\nu}, \tag{C3}
\end{align}

where $\mathcal{T}:=\inf\{t>0:X(t)=0\text{ or }N(t)\}\wedge T$.

Note that the expectation in the objective function is with respect to $P(t)$, which is involved in $dc(t)$ via (C1). Denote

$$\widetilde{P}_{\beta}(t):=e^{-\beta t}\mathbb{E}P(t),\qquad t\in[0,T]. \tag{3.2}$$

Substituting the constraint (C1) into the objective function, and taking into account

$$rb(t)dt-db(t)=-e^{rt}d(e^{-rt}b(t)),$$

along with (3.2), we have

\begin{align}
J(\nu,b) &= -\int_{0}^{\mathcal{T}}e^{(r-\beta)t}d(e^{-rt}b(t))+e^{-\beta\mathcal{T}}b(\mathcal{T}) \tag{3.3}\\
&\qquad+\underbrace{\int_{0}^{\mathcal{T}}\big[-\widetilde{P}_{\beta}(t)\nu(t)+e^{-\beta t}\ell(X(t))\big]dt+e^{-\beta\mathcal{T}}h(X(\mathcal{T}))}_{:=J_{2}(\nu)} \notag\\
&= \underbrace{(r-\beta)\int_{0}^{\mathcal{T}}e^{-\beta t}b(t)dt}_{:=J_{1}(b)}\;+\;J_{2}(\nu), \notag
\end{align}

where $b(0)=0$ is used in the last equality. Hence,

$$U(x):=\sup_{\{(\nu,b)\}}J(\nu,b)=\sup_{b}J_{1}(b)+\sup_{\nu}J_{2}(\nu). \tag{3.4}$$
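For completeness, the integration by parts behind the last equality in (3.3) can be spelled out as follows. This is a routine verification added as a reading aid, under the assumption that $b$ is regular enough (say, of bounded variation) for the Stieltjes integral to make sense.

```latex
\begin{align*}
-\int_{0}^{\mathcal{T}} e^{(r-\beta)t}\,d\big(e^{-rt}b(t)\big)
  &= -\Big[e^{(r-\beta)t}\,e^{-rt}b(t)\Big]_{0}^{\mathcal{T}}
     + (r-\beta)\int_{0}^{\mathcal{T}} e^{(r-\beta)t}\,e^{-rt}b(t)\,dt \\
  &= -e^{-\beta\mathcal{T}}b(\mathcal{T}) + b(0)
     + (r-\beta)\int_{0}^{\mathcal{T}} e^{-\beta t}b(t)\,dt .
\end{align*}
```

Adding the terminal term $e^{-\beta\mathcal{T}}b(\mathcal{T})$ from the objective and using $b(0)=0$ leaves exactly $(r-\beta)\int_{0}^{\mathcal{T}}e^{-\beta t}b(t)dt$, the quantity denoted $J_{1}(b)$.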

Next, suppose $\beta\geq r$, a condition that will be assumed below (and readily justified as the risk premium associated with the valuation of any stake over the risk-free asset). Then, from the $J_{1}(b)$ expression in (3.3), and taking into account $b(t)\geq 0$ as constrained in (C2), we have $\sup_{b}J_{1}(b)=0$ with the optimality binding at $b_{*}(t)=0$ for all $t$. Consequently, the problem in (3.1) is reduced to

$$U(x)=\sup_{\nu}J_{2}(\nu)\quad\text{subject to (C0), (C2'), (C3)}, \tag{3.5}$$

where (C2') is (C2) without the constraints on $b(\cdot)$.

In summary, the key fact here is that the objective $U(x)$ is separable in the control variables $(\nu(t),b(t))$; hence the problem in (3.1) is decomposed into two optimal control problems, one on the risk-free asset $b(t)$, and the other on the trading of stakes $\nu(t)$, as specified in (3.3) and (3.4). Moreover, under the condition $\beta\geq r$, the consumption-investment problem is reduced to the one in (3.5), where the objective function $J_{2}(\nu)$ (refer to (3.3)) takes the form of a tradeoff between the utility from holding stakes ($\ell(X(t))$ and $h(X(\mathcal{T}))$) and the dis-utility of reducing consumption ($-\widetilde{P}_{\beta}(t)\nu(t)$). Thus, the optimal trading strategy needs to strike a balance between these two opposing terms.

Before we present the optimal solution to the consumption-investment problem in (3.5), we make a digression to first study a simple degenerate case of $\widetilde{P}_{\beta}(t)\equiv 0$. This special case removes the tradeoff mentioned above, so the solution becomes a one-sided strategy of always accumulating (or “hoarding”) the stakes at full capacity ($\overline{\nu}$). Yet, as the analysis below will show, there are still some interesting (and subtle) issues involved. More importantly, this special case provides a very accessible path to finding the optimal solution via dynamic programming and the HJB equation.

3.1. Stake-hoarding

As motivated above, here the problem for participant $k$ is reduced to the following (again, omit the subscript $k$):

\begin{align}
U(x):=\sup_{\nu(t)}\;& \int_{0}^{\mathcal{T}}e^{-\beta t}\ell(X(t))dt+e^{-\beta\mathcal{T}}h(X(\mathcal{T})) \tag{3.6}\\
& \text{subject to (C0), (C2'), (C3)}. \notag
\end{align}

Below, we write $\nu_{*}(t)$ for the optimal control process, $X_{*}(t)$ for the corresponding state process, and $\mathcal{T}_{*}:=\inf\{t>0:X_{*}(t)=N(t)\}\wedge T$ for the exit time.

Proposition 3.1.

Denote

$$\gamma(t):=\overline{\nu}N(t)\int_{0}^{t}\frac{ds}{N(s)}+\frac{xN(t)}{N}\quad\text{for }0\leq t\leq T. \tag{3.7}$$

We have:

  • (i) If $\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\leq\frac{N-x}{N}$, then $\mathcal{T}_{*}=T$. The optimal control is $\nu_{*}(t)=\overline{\nu}$ for $0\leq t\leq T$, the optimal state process is $X_{*}(t)=\gamma(t)$ for $0\leq t\leq T$, and $U(x)=e^{-\beta T}h(X_{*}(T))+\int_{0}^{T}e^{-\beta t}\ell(X_{*}(t))dt$.

  • (ii) If $\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}>\frac{N-x}{N}$, set

    $$t_{0}:=\inf\left\{t>0:\overline{\nu}\int_{0}^{t}\frac{ds}{N(s)}=\frac{N-x}{N}\right\}<T.$$

    Assume further that

    $$\big(h(N(t))\big)^{\prime}+\ell(N(t))\leq\beta h(N(t))\quad\text{for all }0\leq t\leq T. \tag{3.8}$$

    Then, $\mathcal{T}_{*}=t_{0}$. The optimal strategy is $\nu_{*}(t)=\overline{\nu}$ for $0\leq t\leq t_{0}$ (and $\nu_{*}(t)=0$ for $t>t_{0}$), the optimal state process is $X_{*}(t)=\gamma(t)$ for $0\leq t\leq t_{0}$ (and $X_{*}(t)=N(t_{0})$ for $t>t_{0}$), and $U(x)=e^{-\beta t_{0}}h(X_{*}(t_{0}))+\int_{0}^{t_{0}}e^{-\beta t}\ell(X_{*}(t))dt$.

Figure 1. Optimal stake trading: concentration and monopoly.

Deferring the proof, we first make a few comments on the above proposition. Note that $\gamma(t)$ as specified in (3.7) is identified as the optimal state process $X_{*}(t)$, which is the number of stakes given $\nu_{*}(t)=\overline{\nu}$. It is easy to see that the participant’s share of stakes, $X_{*}(t)/N(t)$, is increasing in $t$, leading to centralization regardless of how the rewarding scheme is designed (although large rewards may slow down the speed towards concentration). The interesting point of the above proposition is in its part (ii), where the required condition (3.8) is a technical one, to ensure the optimality of $\nu_{*}(t)=\overline{\nu}$. The more substantive fact is $\mathcal{T}_{*}=t_{0}<T$, when $X_{*}(\mathcal{T}_{*})=N(\mathcal{T}_{*})$, i.e., the participant has accumulated all stakes available in the system, leading to the extreme situation of monopoly (or “dictatorship”); and this is done before the end of the horizon, i.e., forcing a premature exit time. See Figure 1 for an illustration.
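As a numerical companion to Proposition 3.1, the sketch below evaluates $\gamma(t)$ from (3.7) for the polynomial family (2.1) and locates $t_{0}$ by bisection. It relies on the closed-form integral that appears later as (3.10); the function names, the bisection tolerance and the sample parameters are assumptions made here for illustration, and condition (3.8) is not checked.

```python
import math


def inv_total_integral(t, n0, alpha):
    """Closed form of the integral of 1/N_alpha(s) over [0, t], as in (3.10)."""
    if alpha == 1:
        return math.log(1.0 + t / n0)
    base = n0 ** (1.0 / alpha)
    return ((base + t) ** (1.0 - alpha) - base ** (1.0 - alpha)) / (1.0 - alpha)


def hoarding_path(t, x, n0, alpha, nu_bar):
    """gamma(t) in (3.7): holdings at time t when buying at full capacity nu_bar."""
    n_t = (n0 ** (1.0 / alpha) + t) ** alpha
    return nu_bar * n_t * inv_total_integral(t, n0, alpha) + x * n_t / n0


def monopoly_time(x, n0, alpha, nu_bar, horizon, tol=1e-10):
    """t0 of Proposition 3.1(ii), or None when case (i) applies on [0, horizon]."""
    target = (n0 - x) / n0
    if nu_bar * inv_total_integral(horizon, n0, alpha) <= target:
        return None
    lo, hi = 0.0, horizon
    # The integral is increasing in t, so plain bisection finds the crossing.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if nu_bar * inv_total_integral(mid, n0, alpha) < target:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # Hypothetical numbers: N = 100, x = 50, constant reward (alpha = 1), T = 10.
    print(monopoly_time(50.0, 100.0, 1.0, 10.0, 10.0))  # large capacity: case (ii)
    print(monopoly_time(50.0, 100.0, 1.0, 1.0, 10.0))   # small capacity: case (i), None
```

For $\alpha=1$ the exit time has the explicit form $t_{0}=N\big(e^{(N-x)/(N\overline{\nu})}-1\big)$, which gives a quick check on the bisection; at $t_{0}$ the path $\gamma$ meets $N(t_{0})$.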

The following corollary illustrates further this extreme phenomenon, with the polynomial family $N_{\alpha}(t)$ defined by (2.1), and with a long time horizon ($T\to\infty$).

Corollary 3.2.

Let $(N_{\alpha}(t),\,0\leq t\leq T)$ be defined by (2.1), $(X_{\alpha,*}(t),\,0\leq t\leq T)$ be the optimal state process defined by (3.7) corresponding to $N_{\alpha}(t)$, and $\mathcal{T}_{\alpha,*}:=\inf\{t>0:X_{\alpha,*}(t)=N_{\alpha}(t)\}$ be the exit time. Assume that the condition (3.8) holds for $N_{\alpha}(t)$. Then, as $T\to\infty$, we have

  • (i) For $\alpha>1$,

    • if $\overline{\nu}\leq(\alpha-1)(N-x)N^{-\frac{1}{\alpha}}$, then $X_{\alpha,*}(t)<N_{\alpha}(t)$ for all $t$. Moreover,

      $$\lim_{t\to\infty}\frac{X_{\alpha,*}(t)}{N_{\alpha}(t)}=\frac{\overline{\nu}}{\alpha-1}N^{\frac{1-\alpha}{\alpha}}+\frac{x}{N}. \tag{3.9}$$

    • if $\overline{\nu}>(\alpha-1)(N-x)N^{-\frac{1}{\alpha}}$, then $\mathcal{T}_{\alpha,*}<\infty$.

  • (ii) For $\alpha\leq 1$, we have $\mathcal{T}_{\alpha,*}<\infty$.

Proof.

Note that

$$\int_{0}^{T}\frac{dt}{N_{\alpha}(t)}=\begin{cases}\frac{1}{1-\alpha}\left((T+N^{\frac{1}{\alpha}})^{1-\alpha}-N^{\frac{1-\alpha}{\alpha}}\right)&\text{for }\alpha\neq 1,\\ \log\left(1+T/N\right)&\text{for }\alpha=1.\end{cases} \tag{3.10}$$

As $T\to\infty$, the dominant term in $\frac{1}{1-\alpha}\left((T+N^{\frac{1}{\alpha}})^{1-\alpha}-N^{\frac{1-\alpha}{\alpha}}\right)$ is $\frac{1}{\alpha-1}N^{\frac{1-\alpha}{\alpha}}$ if $\alpha>1$, and is $\frac{1}{1-\alpha}T^{1-\alpha}$ if $\alpha<1$; and the dominant term in $\log\left(1+T/N\right)$ is $\log T$. It then suffices to compare $\overline{\nu}\int_{0}^{T}\frac{dt}{N_{\alpha}(t)}$ to $\frac{N-x}{N}$, and the rest of the corollary is immediate. ∎

This corollary shows a sharp phase transition towards monopoly in terms of the rewarding schemes. For $\alpha>1$ (increasing reward), there is a threshold for $\overline{\nu}$, only above which monopoly may occur, and below which the share of stakes increases towards the value on the right side of (3.9). For $\alpha\leq 1$ (constant or decreasing reward), monopoly always occurs. Thus, these results have practical implications in the design of the PoS protocol. For instance, if/when certain participants have large capacities, adopting a suitable increasing reward scheme will counter the effect of concentration.
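The phase transition in Corollary 3.2 is easy to tabulate. The following sketch computes the capacity threshold and the limiting share on the right side of (3.9); the helper names and the sample numbers are hypothetical, and condition (3.8) is simply taken for granted, as in the corollary.

```python
def monopoly_threshold(x, n0, alpha):
    """Capacity threshold (alpha - 1) * (N - x) * N**(-1/alpha), meaningful for alpha > 1."""
    return (alpha - 1.0) * (n0 - x) * n0 ** (-1.0 / alpha)


def limiting_share(x, n0, alpha, nu_bar):
    """Right side of (3.9); it is the long-run share only when alpha > 1
    and nu_bar does not exceed monopoly_threshold(x, n0, alpha)."""
    if alpha <= 1:
        raise ValueError("the limit in (3.9) is stated for alpha > 1 only")
    return nu_bar / (alpha - 1.0) * n0 ** ((1.0 - alpha) / alpha) + x / n0


def monopoly_in_finite_time(x, n0, alpha, nu_bar):
    """True in the two situations where the corollary gives a finite exit time."""
    if alpha <= 1:
        return True
    return nu_bar > monopoly_threshold(x, n0, alpha)


if __name__ == "__main__":
    # Hypothetical numbers: N = 100, x = 20, increasing reward with alpha = 2.
    print(monopoly_threshold(20.0, 100.0, 2.0))
    print(limiting_share(20.0, 100.0, 2.0, 4.0))
    print(monopoly_in_finite_time(20.0, 100.0, 2.0, 12.0))
```

A useful consistency check is that the limiting share equals one exactly when the capacity sits at the threshold, which is where the two regimes of part (i) meet.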

Now, returning to the proof of Proposition 3.1, we use the standard machinery of dynamic programming and the HJB equation. Consider the following problem, where $V(t,x)$ is the “value-to-go” function, for $0\leq t\leq T$ and $0\leq x\leq N(t)$:

\begin{align*}
V(t,x):=\max_{\{\nu(s),\,s\geq t\}}\;& \int_{t}^{\mathcal{T}}e^{-\beta s}\ell(X(s))ds+e^{-\beta\mathcal{T}}h(X(\mathcal{T}))\\
\text{subject to }\;& X^{\prime}(s)=\nu(s)+\frac{N^{\prime}(s)}{N(s)}X(s),\quad X(t)=x,\\
& 0\leq X(s)\leq N(s),\\
& |\nu(s)|\leq\overline{\nu}.
\end{align*}

Clearly, the solution to the above problem concerning $V(t,x)$, for all $t\in[0,T]$ and $x\in[0,N(t)]$, will yield the desired solution to $U(x)$ in (3.6), since $U(x)=V(0,x)$. The following lemma identifies an HJB equation (with terminal and boundary conditions), to which $V(t,x)$ is a solution.

Lemma 3.3.

Let $Q:=\{(t,x):0\leq t<T,\,0<x<N(t)\}$. Then $V$ is the (unique) viscosity solution to the following HJB equation:

$$\left\{\begin{array}{l}\partial_{t}v+e^{-\beta t}\ell(x)+\frac{xN^{\prime}(t)}{N(t)}\partial_{x}v+\sup_{|\nu|\leq\overline{\nu}}\{\nu\,\partial_{x}v\}=0\quad\text{for }(t,x)\in Q,\\ v(T,x)=e^{-\beta T}h(x),\\ v(t,0)=e^{-\beta t}h(0),\quad v(t,N(t))=e^{-\beta t}h(N(t)).\end{array}\right. \tag{3.11}$$

Proof.

Write the HJB equation as $\partial_{t}v+H(t,x,\partial_{x}v)=0$, where

$$H(t,x,p):=e^{-\beta t}\ell(x)+\frac{xN^{\prime}(t)}{N(t)}p+\sup_{|\nu|\leq\overline{\nu}}\{\nu p\}.$$

The fact that $V$ as specified in (3.1) is a viscosity solution follows from a standard dynamic programming argument; see [11, Chapter II, Section 7].

Moreover, from the conditions in Assumption 2.1, we have

$$|H(t,x,p)-H(s,y,q)|\leq C(|t-s|+|x-y|+|p-q|+|x-y||p|+|t-s||p|), \tag{3.12}$$

for $0\leq s,t\leq T$ and $0\leq x,y\leq N(t)$, and for some $C>0$. By [11, Chapter II, Corollary 9.1], the HJB equation in (3.11) has a unique viscosity solution, which then must be none other than $V$. ∎
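As a reading aid, note that the supremum appearing in the Hamiltonian $H$ has an elementary closed form, since a linear function of $\nu$ on the interval $[-\overline{\nu},\overline{\nu}]$ is maximized at an endpoint:

```latex
\[
\sup_{|\nu|\leq\overline{\nu}}\{\nu p\}=\overline{\nu}\,|p|,
\qquad\text{attained at }\nu=\overline{\nu}\,\operatorname{sgn}(p)\ \text{ whenever } p\neq 0 .
\]
```

In particular, wherever $\partial_{x}v>0$ the maximizer is $\nu=\overline{\nu}$, i.e., buying at full capacity, which is the candidate examined next.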

What remains is to pin down the term $\sup_{|\nu|\leq\overline{\nu}}\{\nu\,\partial_{x}v\}$ in the HJB equation, i.e., to identify the maximizing $\nu$. Given the intuitive solution that $\nu=\overline{\nu}>0$ (a “conjecture,” so far), the HJB equation in (3.11) is expected to be

$$\left\{\begin{array}{l}\partial_{t}v+e^{-\beta t}\ell(x)+\left(\overline{\nu}+\frac{xN^{\prime}(t)}{N(t)}\right)\partial_{x}v=0\quad\text{in }Q,\\ v(T,x)=e^{-\beta T}h(x),\\ v(t,0)=e^{-\beta t}h(0),\quad v(t,N(t))=e^{-\beta t}h(N(t)),\end{array}\right. \tag{3.13}$$

which is a transport equation with variable coefficients. Now we solve the transport equation (3.13) by the method of characteristics. For $0\leq t\leq T$ and $0\leq x\leq N(t)$, let $\gamma_{t,x}(s)$ be the solution to the following equation:

$$\gamma^{\prime}_{t,x}(s)=\overline{\nu}+\frac{N^{\prime}(s)}{N(s)}\gamma_{t,x}(s),\quad s>t;\qquad\gamma_{t,x}(t)=x. \tag{3.14}$$

A direct computation yields

$$\gamma_{t,x}(s)=\overline{\nu}N(s)\int_{t}^{s}\frac{du}{N(u)}+\frac{xN(s)}{N(t)},\quad s\geq t. \tag{3.15}$$
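The “direct computation” can be sanity-checked numerically. The sketch below takes the constant-reward case $N(s)=N+s$ (an assumption chosen so that the integral in (3.15) is a logarithm), evaluates the characteristic, and compares a central finite difference of it with the right side of (3.14). The names `characteristic` and `ode_residual` and the sample values are hypothetical.

```python
import math


def characteristic(s, t, x, nu_bar, n0):
    """gamma_{t,x}(s) from (3.15), specialised to N(u) = n0 + u."""
    n_s, n_t = n0 + s, n0 + t
    # For linear N, the integral of 1/N(u) over [t, s] equals log(N(s)/N(t)).
    return nu_bar * n_s * math.log(n_s / n_t) + x * n_s / n_t


def ode_residual(s, t, x, nu_bar, n0, h=1e-5):
    """Central-difference derivative of gamma minus the right side of (3.14)."""
    deriv = (characteristic(s + h, t, x, nu_bar, n0)
             - characteristic(s - h, t, x, nu_bar, n0)) / (2.0 * h)
    rhs = nu_bar + characteristic(s, t, x, nu_bar, n0) / (n0 + s)  # N'(s) = 1 here
    return deriv - rhs


if __name__ == "__main__":
    # Hypothetical values: start at t = 2 with x = 30 stakes, capacity 5, N = 100.
    print(characteristic(2.0, 2.0, 30.0, 5.0, 100.0))  # initial condition gamma(t) = x
    print(ode_residual(4.0, 2.0, 30.0, 5.0, 100.0))    # close to zero
```

The residual is zero up to discretization and rounding error, and the first printed value reproduces the initial condition $\gamma_{t,x}(t)=x$.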

Under the regularity conditions in Assumption 2.1, it is standard that (see e.g. [2, 12])

$$v(t,x)=e^{-\beta\mathcal{T}_{t,x}}h(\gamma_{t,x}(\mathcal{T}_{t,x}))+\int_{t}^{\mathcal{T}_{t,x}}e^{-\beta s}\ell(\gamma_{t,x}(s))ds, \tag{3.16}$$

where $\mathcal{T}_{t,x}:=\inf\{s>t:\gamma_{t,x}(s)=N(s)\}\wedge T$. We will next show that $v(t,x)$ given by (3.16) indeed solves the HJB equation (3.11), which then proves Proposition 3.1.

Proof of Proposition 3.1.

From the expression of $\gamma_{t,x}(s)$ in (3.15), we have

$$\partial_{x}\gamma_{t,x}(s)=\frac{N(s)}{N(t)}>0\quad\mbox{and}\quad\partial_{x}\mathcal{T}_{t,x}\leq 0. \tag{3.17}$$

Note that $\gamma_{t,x}(s)/N(s)$ is increasing in $s$. There are two cases.

Case 1: If $\gamma_{t,x}(T)/N(T)\leq 1$, then $\mathcal{T}_{t,x}=T$ and hence $v(t,x)=e^{-\beta T}h(\gamma_{t,x}(T))+\int_{t}^{T}e^{-\beta s}\ell(\gamma_{t,x}(s))\,ds$. By the regularity conditions in Assumption 2.1, we get

$$\partial_{x}v=e^{-\beta T}\frac{N(T)}{N(t)}h^{\prime}(\gamma_{t,x}(T))+\int_{t}^{T}e^{-\beta s}\frac{N(s)}{N(t)}\ell^{\prime}(\gamma_{t,x}(s))\,ds\geq 0,$$

where the non-negativity follows from the fact that $N(t)>0$ and $\ell,h$ are increasing.

Case 2: If $\gamma_{t,x}(T)/N(T)>1$, then $\mathcal{T}_{t,x}<T$, and hence $v(t,x)=e^{-\beta\mathcal{T}_{t,x}}h(N(\mathcal{T}_{t,x}))+\int_{t}^{\mathcal{T}_{t,x}}e^{-\beta s}\ell(\gamma_{t,x}(s))\,ds$. As a result,

$$\begin{aligned}
\partial_{x}v &=-\beta e^{-\beta\mathcal{T}_{t,x}}(\partial_{x}\mathcal{T}_{t,x})h(N(\mathcal{T}_{t,x}))+e^{-\beta\mathcal{T}_{t,x}}(\partial_{x}\mathcal{T}_{t,x})(h\circ N)^{\prime}(\mathcal{T}_{t,x})\\
&\quad+\int_{t}^{\mathcal{T}_{t,x}}e^{-\beta s}\frac{N(s)}{N(t)}\ell^{\prime}(\gamma_{t,x}(s))\,ds+e^{-\beta\mathcal{T}_{t,x}}(\partial_{x}\mathcal{T}_{t,x})\ell(N(\mathcal{T}_{t,x}))\\
&=\int_{t}^{\mathcal{T}_{t,x}}e^{-\beta s}\frac{N(s)}{N(t)}\ell^{\prime}(\gamma_{t,x}(s))\,ds\\
&\quad-e^{-\beta\mathcal{T}_{t,x}}\underbrace{(\partial_{x}\mathcal{T}_{t,x})}_{\leq 0\text{ by (3.17)}}\underbrace{(-\ell\circ N-(h\circ N)^{\prime}+\beta h\circ N)}_{\geq 0\text{ by (3.8)}}(\mathcal{T}_{t,x})\\
&\geq 0.
\end{aligned}$$

So, in both cases, we have $\partial_{x}v(t,x)\geq 0$, which means the supremum $\sup_{|\nu|\leq\overline{\nu}}\{\nu\,\partial_{x}v\}$ is attained at $\nu=\overline{\nu}$ and (3.13) coincides with (3.11). Thus, $v(t,x)$ defined by (3.16) is a classical solution, and hence a viscosity solution, to the HJB equation in (3.11). By Lemma 3.3, we conclude $V(t,x)=v(t,x)$, and the optimal control is $\nu_{*}(s)=\overline{\nu}$ for $s\geq t$. Specializing to $t=0$ yields the results in Proposition 3.1 (and $\gamma(t)$ defined by (3.7) is just $\gamma_{0,x}(t)$). ∎

3.2. Main theorem and proof

We are now ready to present the main result of this section, the optimal solution to $U(x)$ in (3.5) and hence to $U(x)$ in (3.1).

Theorem 3.4.

Assume that $r\leq\beta$, and $\widetilde{P}_{\beta}(t)$ in (3.2) satisfies the Lipschitz condition:

$$|\widetilde{P}_{\beta}(t)-\widetilde{P}_{\beta}(s)|\leq C|t-s|\quad\mbox{for some }C>0. \tag{3.18}$$

Then, $U(x)=v(0,x)$ where $v(t,x)$ is the unique viscosity solution to the following HJB equation, where $Q:=\{(t,x):0\leq t<T,\,0<x<N(t)\}$:

$$\left\{\begin{array}{l}\partial_{t}v+e^{-\beta t}\ell(x)+\frac{xN^{\prime}(t)}{N(t)}\partial_{x}v+\sup_{|\nu|\leq\overline{\nu}}\{\nu(\partial_{x}v-\widetilde{P}_{\beta}(t))\}=0\quad\mbox{in }Q,\\ v(T,x)=e^{-\beta T}h(x),\\ v(t,0)=e^{-\beta t}h(0),\quad v(t,N(t))=e^{-\beta t}h(N(t)).\end{array}\right. \tag{3.19}$$

Moreover, the optimal strategy is $b_{*}(t)=0$ and $\nu_{*}(t)=\nu_{*}(t,X_{*}(t))$ for $0\leq t\leq\mathcal{T}_{*}$, where $\nu_{*}(t,x)$ achieves the supremum in (3.19), $X_{*}(t)$ solves $X_{*}^{\prime}(t)=\nu_{*}(t,X_{*}(t))+\frac{N^{\prime}(t)}{N(t)}X_{*}(t)$ with $X_{*}(0)=x$, and $\mathcal{T}_{*}:=\inf\{t>0:X_{*}(t)=0\mbox{ or }N(t)\}\wedge T$.

Proof.

Similar to the dynamic programming/HJB approach that proves Lemma 3.3 and Proposition 3.1 above, here we consider

$$\begin{aligned}
V_{2}(t,x):={}&\max_{\nu(s)}\ \int_{t}^{\mathcal{T}}\left(-\widetilde{P}_{\beta}(s)\nu(s)+e^{-\beta s}\ell(X(s))\right)ds+e^{-\beta\mathcal{T}}h(X(\mathcal{T}))\\
\mbox{subject to }\ &X^{\prime}(s)=\nu(s)+\frac{N^{\prime}(s)}{N(s)}X(s),\quad X(t)=x,\\
&0\leq X(s)\leq N(s),\\
&|\nu(s)|\leq\overline{\nu},
\end{aligned}$$

so that $U(x)=V_{2}(0,x)$. By the same dynamic programming argument as above, $V_{2}$ solves in the viscosity sense the HJB equation in (3.19), which can be expressed as $\partial_{t}v+H(t,x,\partial_{x}v)=0$, with

$$H(t,x,p):=e^{-\beta t}\ell(x)+\frac{xN^{\prime}(t)}{N(t)}p+\sup_{|\nu|\leq\overline{\nu}}\{\nu(p-\widetilde{P}_{\beta}(t))\}.$$

It is readily checked that under Assumption 2.1 and the Lipschitz condition in (3.18), the inequality in (3.12) holds. Thus, $V_{2}$ as identified above is the unique viscosity solution to the HJB equation in (3.19). The rest of the theorem is straightforward. ∎

Comparing the HJB equations in (3.11) and in (3.19), we see the nonlinear term changes from $\sup_{|\nu|\leq\overline{\nu}}\{\nu\,\partial_{x}v\}$ in the stake-hoarding problem to $\sup_{|\nu|\leq\overline{\nu}}\{\nu(\partial_{x}v-\widetilde{P}_{\beta}(t))\}$ in the stake-trading problem, the latter being the general consumption-investment problem. The more general HJB equation in (3.19) does not have a closed-form solution, and neither does the optimal trading strategy $\nu_{*}(t)$. This calls for numerical methods; see e.g. [19, 23].
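Any numerical scheme for (3.19) needs the Hamiltonian at each grid point. Because the supremum is of a linear function of $\nu$ over $|\nu|\leq\overline{\nu}$, it equals $\overline{\nu}\,|p-\widetilde{P}_{\beta}(t)|$ and is attained at an endpoint. The sketch below evaluates $H$ this way and cross-checks it by brute force; the utility, stake curve and price passed in are hypothetical placeholders, not values from the paper.

```python
import math

def bang_bang(p, price, nu_bar):
    """Maximizer of nu * (p - price) over |nu| <= nu_bar.

    Ties (p == price) are broken toward buying; any admissible nu is optimal there.
    """
    return nu_bar if p >= price else -nu_bar

def hamiltonian(t, x, p, ell, N, dN, price, nu_bar, beta):
    """H(t, x, p) with the supremum evaluated as nu_bar * |p - price(t)|."""
    return (math.exp(-beta * t) * ell(x)
            + x * dN(t) / N(t) * p
            + nu_bar * abs(p - price(t)))

# Hypothetical inputs, chosen only to exercise the functions.
ell = lambda x: 0.2 * x
N = lambda t: (10.0 + t) ** 2
dN = lambda t: 2.0 * (10.0 + t)
price = lambda t: 1.0
nu_bar, beta = 0.5, 0.1

t, x, p = 1.0, 30.0, 1.4
closed = hamiltonian(t, x, p, ell, N, dN, price, nu_bar, beta)

# Brute-force the supremum over a grid of admissible nu as a cross-check.
grid = [-nu_bar + 2 * nu_bar * k / 1000 for k in range(1001)]
brute = (math.exp(-beta * t) * ell(x) + x * dN(t) / N(t) * p
         + max(nu * (p - price(t)) for nu in grid))
print(closed, brute, bang_bang(p, price(t), nu_bar))
```

The closed-form evaluation avoids an inner optimization loop, which matters when $H$ is called at every node of a space-time grid.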

4. Linear and Convex Utilities

4.1. Linear utility

Consider the special case of linear utility, $\ell(x)=\ell x$ and $h(x)=hx$, for some given (positive) constants $\ell$ and $h$. In this case we can derive a closed-form solution to the HJB equation in (3.19), and then derive the optimal strategy $\nu_{*}(t)$ (in terms of $\widetilde{P}_{\beta}(t)$).

To start with, the HJB equation in (3.19) now specializes to the following, with $Q:=\{(t,x):0\leq t<T,\,0<x<N(t)\}$ (as before, refer to Lemma 3.3):

$$\left\{\begin{array}{l}\partial_{t}v+\ell e^{-\beta t}x+\frac{xN^{\prime}(t)}{N(t)}\partial_{x}v+\sup_{|\nu|\leq\overline{\nu}}\{\nu(\partial_{x}v-\widetilde{P}_{\beta}(t))\}=0,\quad(t,x)\in Q,\\ v(T,x)=e^{-\beta T}hx,\\ v(t,0)=0,\quad v(t,N(t))=e^{-\beta t}hN(t).\end{array}\right. \tag{4.1}$$

For the nonlinear term $\sup_{|\nu|\leq\overline{\nu}}\{\nu(\partial_{x}v-\widetilde{P}_{\beta}(t))\}$, we have $\nu_{*}(t,x)=\overline{\nu}$ if $\partial_{x}v(t,x)\geq\widetilde{P}_{\beta}(t)$, and $\nu_{*}(t,x)=-\overline{\nu}$ if $\partial_{x}v(t,x)<\widetilde{P}_{\beta}(t)$.

Next, presuming that $\partial_{x}v\geq\widetilde{P}_{\beta}(t)$, and ignoring the boundary conditions, the HJB equation in (4.1) becomes

$$\partial_{t}v+\ell e^{-\beta t}x-\overline{\nu}\widetilde{P}_{\beta}(t)+\left(\overline{\nu}+\frac{xN^{\prime}(t)}{N(t)}\right)\partial_{x}v=0,\quad v(T,x)=e^{-\beta T}hx,$$

which has the (classical) solution

$$v^{+}(t,x):=he^{-\beta T}\gamma^{+}_{t,x}(T)+\int_{t}^{T}\left[\ell e^{-\beta s}\gamma^{+}_{t,x}(s)-\overline{\nu}\widetilde{P}_{\beta}(s)\right]ds, \tag{4.2}$$

where

$$\gamma^{+}_{t,x}(s):=\overline{\nu}N(s)\int_{t}^{s}\frac{du}{N(u)}+\frac{xN(s)}{N(t)},\quad s\in[t,T]. \tag{4.3}$$

Similarly, presuming that $\partial_{x}v<\widetilde{P}_{\beta}(t)$ and neglecting the boundary conditions turns the HJB equation in (4.1) into the following form:

$$\partial_{t}v+\ell e^{-\beta t}x+\overline{\nu}\widetilde{P}_{\beta}(t)+\left(-\overline{\nu}+\frac{xN^{\prime}(t)}{N(t)}\right)\partial_{x}v=0,\quad v(T,x)=e^{-\beta T}hx,$$

which has the solution

$$v^{-}(t,x):=he^{-\beta T}\gamma^{-}_{t,x}(T)+\int_{t}^{T}\left[\ell e^{-\beta s}\gamma^{-}_{t,x}(s)+\overline{\nu}\widetilde{P}_{\beta}(s)\right]ds, \tag{4.4}$$

where

$$\gamma^{-}_{t,x}(s):=-\overline{\nu}N(s)\int_{t}^{s}\frac{du}{N(u)}+\frac{xN(s)}{N(t)},\quad s\in[t,T]. \tag{4.5}$$

The key observation is that

$$\partial_{x}v^{+}(t,x)=\partial_{x}v^{-}(t,x)=\underbrace{\frac{1}{N(t)}\left(he^{-\beta T}N(T)+\ell\int_{t}^{T}e^{-\beta s}N(s)\,ds\right)}_{:=\Psi(t)}; \tag{4.6}$$

and $\Psi(t)$, notably independent of $x$, is decreasing in $t\in[0,T]$:

$$\Psi(0)=he^{-\beta T}\frac{N(T)}{N}+\frac{\ell}{N}\int_{0}^{T}e^{-\beta t}N(t)\,dt\;\downarrow(\geq)\;\Psi(t)\;\downarrow(\geq)\;\Psi(T)=he^{-\beta T}. \tag{4.7}$$

This suggests that $\nu_{*}(t)=\overline{\nu}$ (buy all the time) if $\sup_{[0,T]}\widetilde{P}_{\beta}(t)\leq\Psi(T)$; and $\nu_{*}(t)=-\overline{\nu}$ (sell all the time) if $\inf_{[0,T]}\widetilde{P}_{\beta}(t)\geq\Psi(0)$. Various other scenarios are also possible, such as first buy then sell, or first sell then buy, and so forth.
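The function $\Psi$ in (4.6) is easy to tabulate. The sketch below evaluates it by a midpoint rule and checks the two properties stated in (4.7): monotone decrease and the terminal value $\Psi(T)=he^{-\beta T}$. The stake curve $N(t)=(N^{1/\alpha}+t)^{\alpha}$ and all numerical values are assumptions made for illustration.

```python
import math

def N_alpha(t, N0=100.0, alpha=2.0):
    """Assumed total-stake curve N(t) = (N0**(1/alpha) + t)**alpha."""
    return (N0 ** (1.0 / alpha) + t) ** alpha

def psi(t, T, h, ell, beta, n=4000):
    """Psi(t) from (4.6); the integral is approximated with a midpoint rule."""
    step = (T - t) / n
    integral = 0.0
    for k in range(n):
        s = t + (k + 0.5) * step
        integral += step * math.exp(-beta * s) * N_alpha(s)
    return (h * math.exp(-beta * T) * N_alpha(T) + ell * integral) / N_alpha(t)

T, h, ell, beta = 5.0, 1.0, 0.2, 0.1
grid = [T * k / 50 for k in range(51)]
values = [psi(t, T, h, ell, beta) for t in grid]

is_decreasing = all(a > b for a, b in zip(values, values[1:]))
print(values[0], values[-1], is_decreasing)
```

Comparing a candidate price curve $\widetilde{P}_{\beta}$ against `values[0]` and `values[-1]` is then enough to detect the "buy all the time" and "sell all the time" regimes described above.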

The following proposition classifies all possible optimal strategies corresponding to $\widetilde{P}_{\beta}(t)$ as specified above, which we will comment on later.

Proposition 4.1.

Let $\ell(x)=\ell x$ and $h(x)=hx$ with $\ell,h>0$, and let $N(t)$ satisfy Assumption 2.1 (i). Assume that $\widetilde{P}_{\beta}(t)$ satisfies the Lipschitz condition in (3.18), and that $\overline{\nu}$ satisfies the following:

$$\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\leq\frac{x}{N}\wedge\frac{N-x}{N}. \tag{4.8}$$

Then, the following results hold:

  • (i)

    Suppose $\widetilde{P}_{\beta}(t)$ stays constant, i.e., for all $t\in[0,T]$, $\widetilde{P}_{\beta}(t)=\widetilde{P}_{\beta}(0)=P(0)$.

    1. (a)

      If $P(0)\geq\Psi(0)$, then $\nu_{*}(t)=-\overline{\nu}$ for all $0\leq t\leq T$. That is, the participant sells at all times at full capacity.

    2. (b)

      If $P(0)\leq\Psi(T)$, then $\nu_{*}(t)=\overline{\nu}$ for all $0\leq t\leq T$. That is, the participant purchases at all times at full capacity.

    3. (c)

      If $\Psi(T)<P(0)<\Psi(0)$, then

      $$\nu_{*}(t)=\left\{\begin{array}{ll}\overline{\nu}&\mbox{for }t\leq t_{0},\\ -\overline{\nu}&\mbox{for }t>t_{0},\end{array}\right.$$

      where $t_{0}$ is the unique point in $[0,T]$ such that $P(0)=\Psi(t_{0})$ with $\Psi(t)$ defined in (4.6). That is, the participant first buys and after some time sells, both at full capacity.

  • (ii)

    Suppose that $\widetilde{P}_{\beta}(t)$ is increasing in $t\in[0,T]$.

    1. (a)

      If $P(0)\geq\Psi(0)$, then $\nu_{*}(t)=-\overline{\nu}$ for all $0\leq t\leq T$. That is, the participant sells all the time at full capacity.

    2. (b)

      If $\widetilde{P}_{\beta}(T)\leq\Psi(T)$, then $\nu_{*}(t)=\overline{\nu}$ for all $0\leq t\leq T$. That is, the participant purchases all the time at full capacity.

    3. (c)

      If $P(0)<\Psi(0)$ and $\widetilde{P}_{\beta}(T)>\Psi(T)$, then

      $$\nu_{*}(t)=\left\{\begin{array}{ll}\overline{\nu}&\mbox{for }t\leq t_{0},\\ -\overline{\nu}&\mbox{for }t>t_{0},\end{array}\right.$$

      where $t_{0}$ is the unique point of intersection of $\widetilde{P}_{\beta}(t)$ and $\Psi(t)$ on $[0,T]$. That is, the participant first buys and after some time sells, both at full capacity.

  • (iii)

    Suppose that $\widetilde{P}_{\beta}(t)$ is decreasing in $t\in[0,T]$.

    1. (a)

      If $P(0)\geq\Psi(0)$, then the participant first sells, and may then buy, etc., always (buy or sell) at full capacity, according to the crossings of $\widetilde{P}_{\beta}(t)$ and $\Psi(t)$ in $[0,T]$.

    2. (b)

      If $P(0)<\Psi(0)$, then the participant first buys, and may then sell, etc., always (buy or sell) at full capacity, according to the crossings of $\widetilde{P}_{\beta}(t)$ and $\Psi(t)$ in $[0,T]$.

Proof.

Recall that $X_{*}(t)$ is the state process (number of stakes) corresponding to the optimal strategy $\nu_{*}(t)$, which, as stipulated in the rest of the proposition, will be equal to either $\overline{\nu}$ or $-\overline{\nu}$. The condition in (4.8) then ensures that $0\leq X_{*}(t)\leq N(t)$ for all $t\in[0,T]$, so $\mathcal{T}_{*}=T$ (i.e., there will be no forced early exit).

Thus, it suffices to find the optimal strategy $\nu_{*}(t)$ from

$$\sup_{|\nu|\leq\overline{\nu}}\{\nu[\partial_{x}v-\widetilde{P}_{\beta}(t)]\}=\sup_{|\nu|\leq\overline{\nu}}\{\nu[\Psi(t)-\widetilde{P}_{\beta}(t)]\}.$$

(i) and (ii). Since $\Psi(t)$ is decreasing and $\widetilde{P}_{\beta}(t)$ is either constant or increasing, $\Psi(t)-\widetilde{P}_{\beta}(t)$ is decreasing. Hence, we have the following cases (for both (i) and (ii)).

(a) If $\widetilde{P}_{\beta}(0)=P(0)\geq\Psi(0)$, then $\widetilde{P}_{\beta}(t)\geq\Psi(t)$ for all $t\in[0,T]$; hence, $\nu_{*}(t)=-\overline{\nu}$, and $U(x)=v^{-}(0,x)$.

(b) Similarly, if $\widetilde{P}_{\beta}(T)\leq\Psi(T)$ (which reads $P(0)\leq\Psi(T)$ in case (i)), then $\widetilde{P}_{\beta}(t)\leq\Psi(t)$ for all $t\in[0,T]$; hence, $\nu_{*}(t)=\overline{\nu}$, and $U(x)=v^{+}(0,x)$.

(c) Otherwise, there will be a unique point at which $\Psi(t)-\widetilde{P}_{\beta}(t)$ (which is decreasing in $t$) crosses $0$ from above; let $t_{0}\in[0,T]$ denote the crossing point. This implies that $\nu_{*}(t)=\overline{\nu}$ for $t\leq t_{0}$, and $\nu_{*}(t)=-\overline{\nu}$ for $t>t_{0}$; and

$$U(x)=v^{+}(0,x)-v^{+}(t_{0},\gamma^{+}_{0,x}(t_{0}))+v^{-}(t_{0},\gamma^{+}_{0,x}(t_{0})).$$

Part (iii) is argued similarly. The only complication is that $\Psi(t)-\widetilde{P}_{\beta}(t)$ is now non-monotone, and hence there may be multiple points at which it crosses $0$. ∎
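Cases (a)-(c) of parts (i) and (ii) amount to locating the sign change of the decreasing gap $\Psi(t)-\widetilde{P}_{\beta}(t)$. The sketch below does this by bisection. To have a hand-checkable answer it uses a constant total stake $N(t)\equiv N$, which is a simplifying assumption made only for this example, so that (4.6) reduces to $\Psi(t)=he^{-\beta T}+\ell(e^{-\beta t}-e^{-\beta T})/\beta$; the function name `classify` and the parameter values are hypothetical.

```python
import math

def classify(psi, price, T, tol=1e-10):
    """Classify the bang-bang strategy when psi - price is decreasing on [0, T].

    Returns ("sell", None), ("buy", None) or ("buy-then-sell", t0).
    """
    gap = lambda t: psi(t) - price(t)
    if gap(0.0) <= 0.0:          # price already at or above psi at time 0
        return "sell", None
    if gap(T) >= 0.0:            # price still at or below psi at time T
        return "buy", None
    lo, hi = 0.0, T              # invariant: gap(lo) > 0 > gap(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return "buy-then-sell", 0.5 * (lo + hi)

# Toy data: constant total stake (assumed), linear utilities.
T, h, ell, beta = 5.0, 1.0, 0.2, 0.1
psi = lambda t: (h * math.exp(-beta * T)
                 + ell * (math.exp(-beta * t) - math.exp(-beta * T)) / beta)

label, t0 = classify(psi, lambda t: 1.0, T)      # constant price, case (i)(c)
t0_exact = -math.log((1.0 + math.exp(-0.5)) / 2.0) / beta
print(label, t0, t0_exact)
print(classify(psi, lambda t: 2.0, T)[0], classify(psi, lambda t: 0.5, T)[0])
```

An increasing price curve can be passed in the same way; for a decreasing one, as in part (iii), the gap need not be monotone and a single bisection is no longer sufficient.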

Several remarks are in order. First note that the condition in (4.8) guarantees that the constraint (C2’) is not activated prior to $T$; that is, it excludes the possibility of monopoly/dictatorship that would trigger a forced early exit. This condition may well be removed, but then we would expect another condition similar to the one in (3.8) to guarantee the optimality of a strategy when an early exit occurs.

Second, $\widetilde{P}_{\beta}(t)=\mathbb{E}\big[e^{-\beta t}P(t)\big]$ combines $\beta$, which measures the participant’s sensitivity towards risk, with the stake price $P(t)$. Thus, the monotone properties of $\widetilde{P}_{\beta}(t)$, which classify the three parts (i)-(iii) in Proposition 4.1, naturally connect to martingale pricing: $\widetilde{P}_{\beta}(t)$ being a constant in (i) makes the process $e^{-\beta t}P(t)$ a martingale; whereas $\widetilde{P}_{\beta}(t)$ increasing or decreasing, respectively in (ii) and (iii), makes $e^{-\beta t}P(t)$ a sub-martingale or a super-martingale.

On the other hand, the function $\Psi(t)=\partial_{x}v^{+}(t,x)=\partial_{x}v^{-}(t,x)$ represents the rate of return of the participant’s utility (from holding of stakes, $x$); and interestingly, in the linear utility case, this return rate is independent of $x$ while decreasing in $t$. Thus, the trading strategy is completely determined by comparing this return rate $\Psi(t)$ with the participant’s risk-adjusted stake price (or valuation) $\widetilde{P}_{\beta}(t)$: if $\Psi(t)\geq\widetilde{P}_{\beta}(t)$ (resp. $\Psi(t)<\widetilde{P}_{\beta}(t)$), then the participant will buy (resp. sell) stakes.

Specifically, following (i) and (ii) of Proposition 4.1, for a constant or an increasing $\widetilde{P}_{\beta}(t)$ (corresponding to a risk-neutral or risk-seeking participant), there are only three possible optimal strategies: buy all the time, sell all the time, or first buy then sell. (The first-buy-then-sell strategy echoes the general investment practice that an early investment pays off at a later date.) See Figure 2 for an illustration.

Figure 2. Optimal stake trading with linear $\ell(\cdot),h(\cdot)$ when $\widetilde{P}_{\beta}(t)$ is constant (left) and $\widetilde{P}_{\beta}(t)$ is increasing (right).

4.2. A special case

In part (iii) of Proposition 4.1, when $\widetilde{P}_{\beta}(t)$ is decreasing in $t$, like $\Psi(t)$, the multiple crossings between the two decreasing functions can be further pinned down when there is more model structure. Consider, for instance, when $P(t)$ follows a geometric Brownian motion (GBM):

$$\frac{dP(t)}{P(t)}=\mu\,dt+\sigma\,dB_{t},\quad{\rm or}\quad P(t)=P(0)e^{(\mu-\sigma^{2}/2)t+\sigma B_{t}};\quad t\in[0,T], \tag{4.9}$$

where $\{B_{t}\}$ denotes the standard Brownian motion; and $\mu>0$ and $\sigma>0$ are the two parameters of the GBM model, representing the rate of return and the volatility of $\{P(t)\}$. From the second equation in (4.9), we have $\mathbb{E}P(t)=P(0)e^{\mu t}$; hence, $\widetilde{P}_{\beta}(t)=P(0)e^{-(\beta-\mu)t}$. Then, a decreasing $\widetilde{P}_{\beta}(t)$ corresponds to $\beta>\mu$. From (4.6), we can derive

$$\Psi^{\prime}(t)=-\frac{N^{\prime}(t)}{N(t)}\Psi(t)-\ell e^{-\beta t},$$

and hence,

$$\left(\Psi(t)-\widetilde{P}_{\beta}(t)\right)^{\prime}=-\frac{N^{\prime}(t)}{N(t)}\Psi(t)-\ell e^{-\beta t}+(\beta-\mu)P(0)e^{-(\beta-\mu)t}. \tag{4.10}$$
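For readers who want the intermediate step, the derivative of $\Psi$ follows from the quotient structure of (4.6) together with the Leibniz rule. The auxiliary symbol $G$ below is introduced only for this derivation and does not appear in the original text.

```latex
\begin{aligned}
\Psi(t) &= \frac{G(t)}{N(t)}, \qquad
G(t) := he^{-\beta T}N(T)+\ell\int_{t}^{T}e^{-\beta s}N(s)\,ds,\\
G'(t) &= -\ell e^{-\beta t}N(t),\\
\Psi'(t) &= -\frac{N'(t)}{N(t)^{2}}\,G(t)+\frac{G'(t)}{N(t)}
          = -\frac{N'(t)}{N(t)}\,\Psi(t)-\ell e^{-\beta t},\\
\widetilde{P}_{\beta}'(t) &= -(\beta-\mu)P(0)e^{-(\beta-\mu)t}.
\end{aligned}
```

Subtracting the last line from the third gives exactly (4.10).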

Let $\Psi_{\alpha}(t)$ denote $\Psi(t)$ for $N(t)=N_{\alpha}(t)$ defined by (2.1). The following proposition gives the conditions under which $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is monotone in the regime $N\to\infty$, and optimal strategies are derived accordingly.

Proposition 4.2.

Suppose the assumptions in Proposition 4.1 hold, with $N(t)=N_{\alpha}(t)$ and $\{P(t)\}$ specified by (4.9) with $\beta>\mu$. As $N\to\infty$, we have the following results:

  • If for some $\varepsilon>0$,

    $$P(0)>\frac{1}{\beta-\mu}\left(\frac{\alpha he^{-\mu T}(N^{\frac{1}{\alpha}}+T)^{\alpha}}{N^{1+\frac{1}{\alpha}}}+\frac{\alpha\ell\beta^{-1}}{N^{\frac{1}{\alpha}}}+\ell\right)+\frac{\varepsilon}{N^{\frac{1}{\alpha}}}, \tag{4.11}$$

    then $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is increasing on $[0,T]$.

  • If for some $\varepsilon>0$,

    $$P(0)<\frac{1}{\beta-\mu}\left(\frac{\alpha he^{-\beta T}}{N^{\frac{1}{\alpha}}+T}+\ell e^{-\mu T}\right)-\frac{\varepsilon}{N^{\frac{1}{\alpha}}}, \tag{4.12}$$

    then $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is decreasing on $[0,T]$.

Consequently, we have:

  1. (a)

    If $P(0)>e^{(\beta-\mu)T}\Psi_{\alpha}(T)$ and (4.11) holds, or $P(0)>\Psi_{\alpha}(0)$ and (4.12) holds, then $\nu_{*}(t)=-\overline{\nu}$ for all $t$. That is, the participant sells all the time at full capacity.

  2. (b)

    If $\Psi_{\alpha}(0)\leq P(0)<e^{(\beta-\mu)T}\Psi_{\alpha}(T)$ and (4.11) holds, then $\nu_{*}(t)=-\overline{\nu}$ for $t\leq t_{0}$ and $\nu_{*}(t)=\overline{\nu}$ for $t>t_{0}$, where $t_{0}$ is the unique point of intersection of $\widetilde{P}_{\beta}(t)$ and $\Psi_{\alpha}(t)$ on $[0,T]$. That is, the participant first sells (before $t_{0}$) and then buys (after $t_{0}$), both at full capacity.

  3. (c)

    If $e^{(\beta-\mu)T}\Psi_{\alpha}(T)\leq P(0)<\Psi_{\alpha}(0)$ and (4.12) holds, then $\nu_{*}(t)=\overline{\nu}$ for $t\leq t_{0}$ and $\nu_{*}(t)=-\overline{\nu}$ for $t>t_{0}$, where $t_{0}$ is the unique point of intersection of $\widetilde{P}_{\beta}(t)$ and $\Psi_{\alpha}(t)$ on $[0,T]$. That is, the participant first buys (before $t_{0}$) and then sells (after $t_{0}$), both at full capacity.

  4. (d)

    If $P(0)<e^{(\beta-\mu)T}\Psi_{\alpha}(T)$ and (4.12) holds, or $P(0)<\Psi_{\alpha}(0)$ and (4.11) holds, then $\nu_{*}(t)=\overline{\nu}$ for all $t$. That is, the participant buys all the time at full capacity.

Proof.

Note that $\frac{N^{\prime}_{\alpha}(t)}{N_{\alpha}(t)}=\alpha(N^{\frac{1}{\alpha}}+t)^{-1}$, and

$$\int_{t}^{T}e^{-\beta s}N_{\alpha}(s)\,ds=e^{\beta N^{\frac{1}{\alpha}}}\beta^{-\alpha-1}\left(\Gamma(\alpha+1,\beta(N^{\frac{1}{\alpha}}+t))-\Gamma(\alpha+1,\beta(N^{\frac{1}{\alpha}}+T))\right),$$

where $\Gamma(a,x):=\int_{x}^{\infty}t^{a-1}e^{-t}\,dt$ is the incomplete Gamma function. As $N\to\infty$, we have

$$\int_{t}^{T}e^{-\beta s}N_{\alpha}(s)\,ds=\beta^{-1}\left(e^{-\beta t}N_{\alpha}(t)-e^{-\beta T}N_{\alpha}(T)\right)+o(N),$$

which together with (4.6) and (4.10) implies that

$$\begin{aligned}
\left(\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)\right)^{\prime}={}&-\frac{\alpha}{N^{\frac{1}{\alpha}}+t}\left[\frac{he^{-\beta T}N_{\alpha}(T)}{N_{\alpha}(t)}+\ell\beta^{-1}\left(e^{-\beta t}-e^{-\beta T}\frac{N_{\alpha}(T)}{N_{\alpha}(t)}\right)+o(1)\right]\\
&-\ell e^{-\beta t}+(\beta-\mu)P(0)e^{-(\beta-\mu)t}.
\end{aligned} \tag{4.13}$$

Multiplying the RHS of (4.13) by $e^{(\beta-\mu)t}$, we get

$$\begin{aligned}
&-\frac{\alpha}{N^{\frac{1}{\alpha}}+t}\left[\frac{he^{-\beta(T-t)-\mu t}N_{\alpha}(T)}{N_{\alpha}(t)}+\ell\beta^{-1}\left(e^{-\mu t}-e^{-\beta(T-t)-\mu t}\frac{N_{\alpha}(T)}{N_{\alpha}(t)}\right)+o(1)\right]\\
&-\ell e^{-\mu t}+(\beta-\mu)P(0).
\end{aligned}$$

Clearly, the sum of all the terms above is lower bounded by

$$-\left(\frac{\alpha he^{-\mu T}N_{\alpha}(T)}{N^{1+\frac{1}{\alpha}}}+\alpha\ell\beta^{-1}N^{-\frac{1}{\alpha}}+\ell\right)+(\beta-\mu)P(0)\stackrel{(4.11)}{>}0,$$

which implies that $\inf_{[0,T]}\left(\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)\right)^{\prime}>0$, and hence, $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is increasing.

Moreover, the same sum is upper bounded by

$$-\left(\frac{\alpha he^{-\beta T}}{N^{\frac{1}{\alpha}}+T}+\ell e^{-\mu T}\right)+(\beta-\mu)P(0)\stackrel{(4.12)}{<}0,$$

which implies that $\sup_{[0,T]}\left(\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)\right)^{\prime}<0$, and hence, $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is decreasing.

(a) If $P(0)>e^{(\beta-\mu)T}\Psi_{\alpha}(T)$ and (4.11) holds, then $\Psi_{\alpha}(T)<\widetilde{P}_{\beta}(T)$ and $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is increasing. If $P(0)>\Psi_{\alpha}(0)$ and (4.12) holds, then $\Psi_{\alpha}(0)<\widetilde{P}_{\beta}(0)$ and $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is decreasing. In both cases, we have $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)<0$ for all $t$.

(b), (c) and (d) follow by the same argument as (a). ∎
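The large-$N$ expansion of $\int_{t}^{T}e^{-\beta s}N_{\alpha}(s)\,ds$ used in the proof can be probed numerically. Integrating by parts shows that the remainder equals $\beta^{-1}\int_{t}^{T}e^{-\beta s}N_{\alpha}^{\prime}(s)\,ds$, so it should be small relative to $N$. The sketch below, which assumes $N_{\alpha}(s)=(N^{1/\alpha}+s)^{\alpha}$ and uses illustrative parameter values, reports the remainder divided by $N$ for growing $N$.

```python
import math

def N_alpha(s, N0, alpha):
    """Assumed total-stake curve N(s) = (N0**(1/alpha) + s)**alpha."""
    return (N0 ** (1.0 / alpha) + s) ** alpha

def discounted_integral(t, T, N0, alpha, beta, n=4000):
    """Midpoint rule for int_t^T exp(-beta*s) * N_alpha(s) ds."""
    step = (T - t) / n
    total = 0.0
    for k in range(n):
        s = t + (k + 0.5) * step
        total += step * math.exp(-beta * s) * N_alpha(s, N0, alpha)
    return total

def leading_term(t, T, N0, alpha, beta):
    """Leading term (exp(-beta*t) N(t) - exp(-beta*T) N(T)) / beta."""
    return (math.exp(-beta * t) * N_alpha(t, N0, alpha)
            - math.exp(-beta * T) * N_alpha(T, N0, alpha)) / beta

t, T, alpha, beta = 0.0, 5.0, 2.0, 0.1
ratios = []
for N0 in (1e2, 1e4, 1e6, 1e8):
    remainder = (discounted_integral(t, T, N0, alpha, beta)
                 - leading_term(t, T, N0, alpha, beta))
    ratios.append(abs(remainder) / N0)
print(ratios)
```

In this example the ratio falls by roughly a factor of ten for each hundredfold increase in $N$, which is consistent with a remainder of order $N^{1-1/\alpha}$; the convergence is therefore slow for moderate $N$.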

See Figure 3 below for an illustration of the results in the above proposition. Also note that the connection to the participant’s risk sensitivity, as remarked at the end of §4.1, can be made more explicit when the price process $P(t)$ follows the GBM model in (4.9), for which we have $\widetilde{P}_{\beta}(t)=P(0)e^{-(\beta-\mu)t}$. Then, the three cases in Proposition 4.1 correspond to $\beta=\mu$ (martingale), $\beta<\mu$ (sub-martingale), and $\beta>\mu$ (super-martingale). These three ranges of $\beta$ can be viewed as representing a risk-neutral, risk-seeking and risk-averse participant, respectively.

Figure 3. Optimal stake trading with linear $\ell(\cdot),h(\cdot)$ when $\widetilde{P}_{\beta}(t)=P(0)e^{(\mu-\beta)t}$ and $N(t)=N_{\alpha}(t)$.
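The GBM identity $\widetilde{P}_{\beta}(t)=\mathbb{E}[e^{-\beta t}P(t)]=P(0)e^{-(\beta-\mu)t}$ behind this case can be checked by simulation. The following is a minimal Monte Carlo sketch using the explicit solution in (4.9); the parameter values, sample size and seed are arbitrary choices for illustration.

```python
import math
import random

def discounted_mean_price(P0, mu, sigma, beta, t, n_paths=200_000, seed=0):
    """Monte Carlo estimate of E[exp(-beta*t) * P(t)] under the GBM (4.9)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        B_t = rng.gauss(0.0, math.sqrt(t))          # B_t ~ Normal(0, t)
        total += P0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * B_t)
    return math.exp(-beta * t) * total / n_paths

P0, mu, sigma, beta, t = 1.0, 0.05, 0.3, 0.1, 2.0
estimate = discounted_mean_price(P0, mu, sigma, beta, t)
exact = P0 * math.exp(-(beta - mu) * t)
print(estimate, exact)
```

With $\beta>\mu$ as here, the exact curve is decreasing in $t$, which is the super-martingale regime of part (iii) of Proposition 4.1.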

4.3. Convex utility

It is possible to extend the above results to more general, non-linear utility functions $\ell(\cdot)$ and $h(\cdot)$, by following the same approach as above that leads to $v^{+}(t,x)$ and $v^{-}(t,x)$ in (4.2) and (4.4).

Specifically, considering the two cases of $\partial_{x}v\geq\widetilde{P}_{\beta}(t)$ and $\partial_{x}v<\widetilde{P}_{\beta}(t)$, we can derive

$$v^{+}(t,x):=e^{-\beta T}h\big(\gamma^{+}_{t,x}(T)\big)+\int_{t}^{T}\left[e^{-\beta s}\ell\big(\gamma^{+}_{t,x}(s)\big)-\overline{\nu}\widetilde{P}_{\beta}(s)\right]ds, \tag{4.14}$$

$$v^{-}(t,x):=e^{-\beta T}h\big(\gamma^{-}_{t,x}(T)\big)+\int_{t}^{T}\left[e^{-\beta s}\ell\big(\gamma^{-}_{t,x}(s)\big)+\overline{\nu}\widetilde{P}_{\beta}(s)\right]ds; \tag{4.15}$$

whereas $\gamma_{t,x}^{+}$ and $\gamma_{t,x}^{-}$ remain the same as in (4.3) and (4.5).

The $\Psi$ function in (4.6) now splits into two functions: for $(t,x)\in Q:=\{(t,x):0\leq t<T,\,0<x<N(t)\}$, we have

$$\partial_{x}v^{+}(t,x)=\underbrace{\frac{1}{N(t)}\left(e^{-\beta T}N(T)h^{\prime}\big(\gamma_{t,x}^{+}(T)\big)+\int_{t}^{T}e^{-\beta s}N(s)\ell^{\prime}\big(\gamma^{+}_{t,x}(s)\big)\,ds\right)}_{:=\Psi^{+}(t,x)}, \tag{4.16}$$

and

$$\partial_{x}v^{-}(t,x)=\underbrace{\frac{1}{N(t)}\left(e^{-\beta T}N(T)h^{\prime}\big(\gamma_{t,x}^{-}(T)\big)+\int_{t}^{T}e^{-\beta s}N(s)\ell^{\prime}\big(\gamma^{-}_{t,x}(s)\big)\,ds\right)}_{:=\Psi^{-}(t,x)}. \tag{4.17}$$

Note that both $\Psi^{+}$ and $\Psi^{-}$ depend on $x$ (as well as on $t$), via $\gamma^{+}_{t,x}$ and $\gamma^{-}_{t,x}$. This dependence makes it necessary to take a closer look at $\gamma^{+}_{t,x}$ and $\gamma^{-}_{t,x}$, since the $x=x(t)$ involved in both depends on the control $\nu$ before (and up to) $t$. We have the following cases: for $s\geq t$,

$$\mbox{if }x=\gamma^{+}_{0,x}(t),\ \mbox{then}\quad\gamma^{+}_{+}(s):=\gamma^{+}_{t,x}(s)=\left(\overline{\nu}\int_{t}^{s}\frac{du}{N(u)}+\overline{\nu}\int_{0}^{t}\frac{du}{N(u)}+\frac{x}{N}\right)N(s), \tag{4.18}$$

$$\mbox{if }x=\gamma^{-}_{0,x}(t),\ \mbox{then}\quad\gamma^{-}_{-}(s):=\gamma^{-}_{t,x}(s)=\left(-\overline{\nu}\int_{t}^{s}\frac{du}{N(u)}-\overline{\nu}\int_{0}^{t}\frac{du}{N(u)}+\frac{x}{N}\right)N(s). \tag{4.19}$$

In other words, $\gamma^{+}_{+}$ corresponds to $\nu=\overline{\nu}$ both before and after $t$, whereas $\gamma^{-}_{-}$ corresponds to $\nu=-\overline{\nu}$ both before and after $t$. The other two cases are similar:

$$\mbox{if }x=\gamma^{+}_{0,x}(t),\ \mbox{then}\quad\gamma^{-}_{+}(s):=\gamma^{-}_{t,x}(s)=\left(-\overline{\nu}\int_{t}^{s}\frac{du}{N(u)}+\overline{\nu}\int_{0}^{t}\frac{du}{N(u)}+\frac{x}{N}\right)N(s), \tag{4.20}$$

$$\mbox{if }x=\gamma^{-}_{0,x}(t),\ \mbox{then}\quad\gamma^{+}_{-}(s):=\gamma^{+}_{t,x}(s)=\left(\overline{\nu}\int_{t}^{s}\frac{du}{N(u)}-\overline{\nu}\int_{0}^{t}\frac{du}{N(u)}+\frac{x}{N}\right)N(s); \tag{4.21}$$

where $\gamma^{-}_{+}$ corresponds to $\nu=\overline{\nu}$ before (and up to) $t$ and $\nu=-\overline{\nu}$ after $t$, and $\gamma^{+}_{-}$ corresponds to the other way around.

Substituting these four cases into $\Psi^{+}$ and $\Psi^{-}$ in (4.16) and (4.17) further splits the latter two into four cases:

$$\Psi^{+}_{+}(t):=\Psi^{+}(t,\gamma^{+}_{0,x}(t)),\quad\Psi^{-}_{-}(t):=\Psi^{-}(t,\gamma^{-}_{0,x}(t)); \tag{4.22}$$

$$\Psi^{-}_{+}(t):=\Psi^{-}(t,\gamma^{+}_{0,x}(t)),\quad\Psi^{+}_{-}(t):=\Psi^{+}(t,\gamma^{-}_{0,x}(t)). \tag{4.23}$$

All four are now functions of $t$ only, as $x$ has been replaced by either $\gamma^{+}_{0,x}(t)$ or $\gamma^{-}_{0,x}(t)$.

Clearly, from (4.18)-(4.21) above, we have

$$\partial_{t}\gamma^{+}_{+}(s)=\partial_{t}\gamma^{-}_{-}(s)=0,\quad\partial_{t}\gamma^{-}_{+}(s)=\frac{2\overline{\nu}N(s)}{N(t)}>0,\quad\partial_{t}\gamma^{+}_{-}(s)=-\frac{2\overline{\nu}N(s)}{N(t)}<0. \tag{4.24}$$

Now, suppose $\ell(\cdot)$ and $h(\cdot)$ are both smooth, convex (and increasing) functions. Hence, $\ell^{\prime}(\cdot)\geq 0$ and $h^{\prime}(\cdot)\geq 0$, and both are increasing functions. Then, it is readily verified:

  • (i)

    Both $\Psi^{+}_{+}(t)$ and $\Psi^{-}_{-}(t)$ are decreasing in $t\in[0,T]$, and so is $\Psi^{+}_{-}(t)$; whereas $\Psi^{-}_{+}(t)$ could be both increasing and decreasing (i.e., non-monotone).

  • (ii)

    Furthermore, $\Psi^{+}_{+}(t)\geq\Psi^{-}_{-}(t)$ for all $t\in[0,T]$.

For instance, for $\Psi^{+}_{+}(t)$ in (i), consider

$$\begin{aligned}
\partial_{t}\Psi^{+}_{+}(t)={}&e^{-\beta T}N(T)\left(\frac{h^{\prime\prime}(\gamma^{+}_{+}(T))\,\partial_{t}\gamma^{+}_{+}(T)}{N(t)}-\frac{h^{\prime}(\gamma^{+}_{+}(T))N^{\prime}(t)}{N^{2}(t)}\right)\\
&-\frac{N^{\prime}(t)}{N^{2}(t)}\int_{t}^{T}e^{-\beta s}N(s)\ell^{\prime}(\gamma^{+}_{+}(s))\,ds-e^{-\beta t}\ell^{\prime}(\gamma^{+}_{+}(t))\\
&+\frac{1}{N(t)}\int_{t}^{T}e^{-\beta s}N(s)\ell^{\prime\prime}(\gamma^{+}_{+}(s))\,\partial_{t}\gamma^{+}_{+}(s)\,ds\ \leq 0,
\end{aligned} \tag{4.25}$$

where $\leq 0$ follows from $\partial_{t}\gamma^{+}_{+}(\cdot)=0$ in both the first and last terms on the RHS. The other two cases, $\partial_{t}\Psi^{-}_{-}(t)\leq 0$ and $\partial_{t}\Psi^{+}_{-}(t)\leq 0$, are similarly verified.
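Properties (i) and (ii) for $\Psi^{+}_{+}$ and $\Psi^{-}_{-}$ can be illustrated numerically. The sketch below takes the quadratic utilities $\ell(x)=h(x)=0.01x^{2}$ and the stake curve $N(t)=(10+t)^{2}$, both chosen here purely for illustration, with $\overline{\nu}$ small enough that condition (4.8) holds. It uses the fact, expressed by (4.18) and (4.19), that continuing the same control from $\gamma^{\pm}_{0,x}(t)$ simply follows $\gamma^{\pm}_{0,x}(s)$ for $s\geq t$.

```python
import math

C = 10.0                         # assumed N(t) = (C + t)**2, so N(0) = 100
T, BETA, NU_BAR, X0 = 5.0, 0.1, 3.0, 50.0

N = lambda s: (C + s) ** 2
A = lambda s: 1.0 / C - 1.0 / (C + s)   # closed form of int_0^s du / N(u)

d_ell = lambda x: 0.02 * x       # derivative of the assumed ell(x) = 0.01 x^2
d_h = lambda x: 0.02 * x         # derivative of the assumed h(x) = 0.01 x^2

def gamma(sign, s):
    """gamma^+_{0,x}(s) for sign=+1, gamma^-_{0,x}(s) for sign=-1 (started at X0)."""
    return (sign * NU_BAR * A(s) + X0 / N(0.0)) * N(s)

def psi_same(sign, t, n=2000):
    """Psi^+_+(t) for sign=+1, Psi^-_-(t) for sign=-1 (same control before and after t)."""
    step = (T - t) / n
    integral = 0.0
    for k in range(n):
        s = t + (k + 0.5) * step
        integral += step * math.exp(-BETA * s) * N(s) * d_ell(gamma(sign, s))
    return (math.exp(-BETA * T) * N(T) * d_h(gamma(sign, T)) + integral) / N(t)

# Condition (4.8): nu_bar * int_0^T dt/N(t) <= min(x/N, (N - x)/N).
cond_48 = NU_BAR * A(T) <= min(X0 / N(0.0), (N(0.0) - X0) / N(0.0))

grid = [T * k / 40 for k in range(41)]
plus = [psi_same(+1, t) for t in grid]
minus = [psi_same(-1, t) for t in grid]
print(cond_48, plus[0], plus[-1], minus[0], minus[-1])
```

The mixed functions $\Psi^{-}_{+}$ and $\Psi^{+}_{-}$ could be tabulated the same way by switching the sign of the control at $t$; they are omitted here to keep the example short.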

As in the case of linear utility, the properties above can be used to compare against P~β(t)\widetilde{P}_{\beta}(t) to identify the optimal trading strategy. Consider the case of P~β(t)\widetilde{P}_{\beta}(t) being a constant, P~β(t)=P(0)\widetilde{P}_{\beta}(t)=P(0) for all t[0,T]t\in[0,T], as in part (i) of Proposition 4.1. If Ψ++(t)Ψ(t)P(0)\Psi^{+}_{+}(t)\geq\Psi^{-}_{-}(t)\geq P(0) for all t[0,t]t\in[0,t], then the optimal strategy is to buy all the time and at rate ν¯\overline{\nu}. If P(0)Ψ++(t)>Ψ(t)P(0)\geq\Psi^{+}_{+}(t)>\Psi^{-}_{-}(t) for all t[0,t]t\in[0,t], then it is optimal to sell all the time, at full capacity.

On the other hand, since Ψ+\Psi^{+}_{-} corresponds to sell first (before tt) and then buy, this clearly cannot be optimal, as it is impossible for Ψ+P(0)\Psi^{+}_{-}\leq P(0) before tt and Ψ+P(0)\Psi^{+}_{-}\geq P(0) after tt, since Ψ+\Psi^{+}_{-} is decreasing in tt. Similarly, Ψ+\Psi^{-}_{+} corresponds to buy first (before tt) and then sell, which can be optimal provided if Ψ+(t)\Psi^{-}_{+}(t) is decreasing in tt.

The details are stated in the following proposition; see Figure 4 for an illustration.

Proposition 4.3.

Assume that $\ell(\cdot)$ and $h(\cdot)$ are twice continuously differentiable, convex, and satisfy the conditions in Assumption 2.1. Assume that $\widetilde{P}_{\beta}(t)$ stays constant, i.e. $\widetilde{P}_{\beta}(t)=P(0)$ for all $t\in[0,T]$. Further assume the condition (4.8), and that $t\mapsto\Psi^{-}_{+}(t)$ is decreasing. Then:

  1. (a)

    If $P(0)\geq\Psi^{+}_{+}(T)\vee\Psi^{-}(0,x)$, then $\nu_{*}(t)=-\overline{\nu}$ for all $0\leq t\leq T$. That is, the participant sells at all times at full capacity.

  2. (b)

    If $P(0)\leq\Psi^{-}_{+}(T)$, then $\nu_{*}(t)=\overline{\nu}$ for all $0\leq t\leq T$. That is, the participant buys at all times at full capacity.

  3. (c)

    If $\Psi^{+}_{+}(T)<\Psi^{-}(0,x)$ and $\Psi^{-}_{+}(T)<P(0)<\Psi^{-}(0,x)$, then

    \[\nu_{*}(t)=\begin{cases}\overline{\nu} & \text{for } t\leq t_{0},\\ -\overline{\nu} & \text{for } t>t_{0},\end{cases}\]

    where $t_{0}$ is the unique point in $[0,T]$ such that $\Psi^{-}_{+}(t_{0})=P(0)$. That is, the participant first buys and after some time sells, both at full capacity.

  4. (d)

    If $\Psi^{-}(0,x)<\Psi^{+}_{+}(T)$, then

    1. (1)

      if $\Psi^{-}(0,x)<P(0)<\Psi^{+}_{+}(T)$, then $\nu_{*}(t)=-\overline{\nu}$ for all $0\leq t\leq T$. That is, the participant sells at all times at full capacity.

    2. (2)

      if $\Psi^{-}_{+}(T)<P(0)\leq\Psi^{-}(0,x)$, then

      \[\nu_{*}(t)=\begin{cases}\overline{\nu} & \text{for } t\leq t_{0},\\ -\overline{\nu} & \text{for } t>t_{0},\end{cases}\]

      where $t_{0}$ is the unique point in $[0,T]$ such that $\Psi^{-}_{+}(t_{0})=P(0)$. That is, the participant first buys and after some time sells, both at full capacity.

Figure 4. Optimal stake trading with convex $\ell(\cdot),h(\cdot)$ when $\widetilde{P}_{\beta}(t)$ is constant, for $\Psi^{+}_{+}(T)<\Psi^{-}(0,x)$ (left) and $\Psi^{+}_{+}(T)>\Psi^{-}(0,x)$ (right).
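The case analysis of Proposition 4.3 can be read as a small decision procedure. The following is only an illustrative sketch: it assumes the threshold values $\Psi^{+}_{+}(T)$, $\Psi^{-}(0,x)$ and $\Psi^{-}_{+}(T)$ have already been computed, and the function names `classify_regime` and `switching_time`, as well as the toy numbers, are hypothetical and not taken from the paper. The cases are checked in the order listed in the proposition, and the switching time $t_{0}$ is located by bisection, which relies on $\Psi^{-}_{+}$ being decreasing.

```python
from typing import Callable, Optional


def classify_regime(p0: float, psi_pp_T: float, psi_m_0x: float,
                    psi_mp_T: float) -> Optional[str]:
    """Map threshold values to a regime of Proposition 4.3.

    p0        : constant valuation P(0)
    psi_pp_T  : Psi^+_+(T)
    psi_m_0x  : Psi^-(0, x)
    psi_mp_T  : Psi^-_+(T)
    Returns "sell", "buy", "buy-then-sell", or None if no listed case applies.
    """
    if p0 >= max(psi_pp_T, psi_m_0x):          # case (a)
        return "sell"
    if p0 <= psi_mp_T:                         # case (b)
        return "buy"
    if psi_pp_T < psi_m_0x:                    # case (c)
        if psi_mp_T < p0 < psi_m_0x:
            return "buy-then-sell"
    elif psi_m_0x < psi_pp_T:                  # case (d)
        if psi_m_0x < p0 < psi_pp_T:           # (d)(1)
            return "sell"
        if psi_mp_T < p0 <= psi_m_0x:          # (d)(2)
            return "buy-then-sell"
    return None


def switching_time(psi_mp: Callable[[float], float], p0: float, T: float,
                   tol: float = 1e-10) -> float:
    """Bisection for t0 with psi_mp(t0) = p0, assuming psi_mp is continuous
    and decreasing on [0, T] with psi_mp(0) >= p0 >= psi_mp(T)."""
    lo, hi = 0.0, T
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi_mp(mid) > p0:
            lo = mid      # Psi^-_+ still above P(0): the switch happens later
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy illustration with made-up numbers (not taken from the paper).
toy_psi_mp = lambda t: 2.0 - t                 # hypothetical decreasing Psi^-_+
regime = classify_regime(p0=1.5, psi_pp_T=1.2, psi_m_0x=2.0, psi_mp_T=1.0)
t0 = switching_time(toy_psi_mp, p0=1.5, T=1.0)
print(regime, round(t0, 6))
```

In practice the three thresholds would come from evaluating the integrals that define the $\Psi$ functions; the sketch only encodes the comparison logic.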

5. Extension: Risk Control

In the previous sections, we have focused on profit-seeking objectives, in which a participant’s utility increases with holding more stakes or consuming more. In the modern finance literature, Markowitz [16] pioneered the idea of balancing return and risk in any investment. This idea is particularly relevant for cryptocurrency trading, which often involves substantial volatility. In this spirit, here we add to the utility objective two “cost” terms that penalize the deviation of participant $k$’s holding of stakes from the average over all participants. The idea is that, to the extent this deviation measures risk (analogous to the variance in the Markowitz model), it is the price to be paid for the utility (in holding stakes) that $k$ wants to maximize. (The same idea has been used in [13] in the context of stochastic games.) Specifically, the deviation of participant $k$’s holding from the average can be expressed as $|X_{k}(t)-\frac{N(t)}{K}|$, taking into account $N(t)=\sum_{k=1}^{K}X_{k}(t)$. Hence, the new objective function is:

\begin{align}
U(x):=\sup_{\{\nu(t),b(t)\}}\; & J(\nu,b):=\mathbb{E}\bigg\{\int_{0}^{\mathcal{T}}e^{-\beta t}\big(dc(t)+\ell(X(t))\,dt\big)+e^{-\beta\mathcal{T}}\big(b(\mathcal{T})+h(X(\mathcal{T}))\big) \notag\\
& \qquad-\int_{0}^{\mathcal{T}}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt-e^{-\delta\mathcal{T}}q\left(X(\mathcal{T})-\frac{N(\mathcal{T})}{K}\right)\bigg\} \tag{5.1}\\
\text{subject to }\; & X'(t)=\nu(t)+\frac{N'(t)}{N(t)}X(t),\quad X(0)=x, \tag{C0}\\
& dc(t)+db(t)-rb(t)\,dt+P(t)\nu(t)\,dt=0, \tag{C1}\\
& b(0)=0,\quad b(t)\geq 0 \text{ and } 0\leq X(t)\leq N(t), \tag{C2}\\
& |\nu(t)|\leq\overline{\nu}, \tag{C3}
\end{align}

where $\delta>0$ is a discount factor (which may or may not be equal to $\beta$), and $g:\mathbb{R}\to\mathbb{R}_{+}$ and $q:\mathbb{R}\to\mathbb{R}_{+}$ are symmetric and increasing on $\mathbb{R}_{+}$ (a typical example is $g(x)=gx^{2}$ and $q(x)=qx^{2}$ with $g,q>0$).

The theorem below follows from the same argument as Theorem 3.4.

Theorem 5.1.

Let the assumptions in Theorem 3.4 hold for the problem (5.1). Assume that $g,q\in\mathcal{C}^{1}(\mathbb{R})$ are symmetric and increasing on $\mathbb{R}_{+}$. Then $U(x)=v(0,x)$, where $v(t,x)$ is the unique viscosity solution to the following HJB equation:

\begin{equation}
\left\{\begin{array}{l}
\partial_{t}v+e^{-\beta t}\ell(x)-e^{-\delta t}g\left(x-\frac{N(t)}{K}\right)+\frac{xN'(t)}{N(t)}\partial_{x}v+\sup_{|\nu|\leq\overline{\nu}}\left\{\nu\left(\partial_{x}v-\widetilde{P}_{\beta}(t)\right)\right\}=0\quad\text{in }Q,\\[4pt]
v(T,x)=e^{-\beta T}h(x)-e^{-\delta T}q\left(x-\frac{N(T)}{K}\right),\\[4pt]
v(t,0)=e^{-\beta t}h(0)-e^{-\delta t}q\left(\frac{N(t)}{K}\right),\quad v(t,N(t))=e^{-\beta t}h(N(t))-e^{-\delta t}q\left(\frac{(K-1)N(t)}{K}\right).
\end{array}\right. \tag{5.2}
\end{equation}

Moreover, the optimal strategy is $b_{*}(t)=0$ and $\nu_{*}(t)=\nu_{*}(t,X_{*}(t))$ for $0\leq t\leq\mathcal{T}_{*}$ (if it exists), where $\nu_{*}(t,x)$ achieves the supremum in (3.19), and $X_{*}(t)$ solves $X_{*}'(t)=\nu_{*}(t,X_{*}(t))+\frac{N'(t)}{N(t)}X_{*}(t)$ with $X_{*}(0)=x$, and $\mathcal{T}_{*}:=\inf\{t>0:X_{*}(t)=0\text{ or }N(t)\}\wedge T$.

In general, the HJB equation (5.2) does not have a closed-form solution even when $\ell,h$ are linear and $g,q$ are quadratic. Solving it, and then finding the optimal strategy $\nu_{*}$, again requires numerical methods. Nevertheless, there is one exception, in which the participant is only concerned with the risk entailed by the stakes. The objective is then to solve the stake parity problem:

\begin{align}
U(x):=\inf_{\nu(t)}\; & J(\nu):=\int_{0}^{\mathcal{T}}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt+e^{-\delta\mathcal{T}}q\left(X(\mathcal{T})-\frac{N(\mathcal{T})}{K}\right) \tag{5.3}\\
\text{subject to }\; & X'(t)=\nu(t)+\frac{N'(t)}{N(t)}X(t),\quad X(0)=x, \tag{C0}\\
& b(0)=0,\quad b(t)\geq 0 \text{ and } 0\leq X(t)\leq N(t), \tag{C2'}\\
& |\nu(t)|\leq\overline{\nu}. \tag{C3}
\end{align}

Since $g,q$ attain their minimum at $0$, if $x\geq N/K$, then the participant sells at full capacity until hitting the average $N(t)/K$; if $x<N/K$, then the participant purchases at full capacity until hitting the average $N(t)/K$. We record this simple fact in the following proposition.
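The mean-reverting behaviour described above can be illustrated numerically. The sketch below is not the paper's method: it is a plain Euler discretization of the dynamics (C0) under a parity-seeking rule capped by (C3), with a hypothetical linear volume $N(t)=100+10t$, $K=10$ participants, and made-up initial stake and capacity. At each step the rule picks the trading rate that would land exactly on the next parity level and clips it to $[-\overline{\nu},\overline{\nu}]$, which avoids chattering around the target.

```python
from typing import Callable, List


def simulate_parity(x0: float, K: int, nu_bar: float, T: float,
                    N: Callable[[float], float],
                    dN: Callable[[float], float],
                    steps: int = 10_000) -> List[float]:
    """Euler scheme for X'(t) = nu(t) + N'(t)/N(t) * X(t) under a
    parity-seeking rule with |nu| <= nu_bar."""
    dt = T / steps
    X = x0
    path = [X]
    for i in range(steps):
        t = i * dt
        drift = dN(t) / N(t) * X            # reward inflow, proportional to the stake
        target = N(t + dt) / K              # parity level at the next grid point
        nu = (target - X) / dt - drift      # rate that would land exactly on parity
        nu = max(-nu_bar, min(nu_bar, nu))  # enforce the capacity constraint (C3)
        X += (nu + drift) * dt
        path.append(X)
    return path


# Hypothetical parameters: linear volume N(t) = 100 + 10 t, K = 10 participants.
N = lambda t: 100.0 + 10.0 * t
dN = lambda t: 10.0
path = simulate_parity(x0=12.0, K=10, nu_bar=1.0, T=5.0, N=N, dN=dN)
# Initial and final gap to the parity level N(t)/K.
print(path[0] - N(0.0) / 10, path[-1] - N(5.0) / 10)
```

With these numbers the participant starts above parity, sells at full capacity until the stake meets $N(t)/K$, and then needs no further trading, since the reward inflow alone keeps a holder at parity on the parity path.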

Proposition 5.2.

Assume that $g,q\in\mathcal{C}^{1}(\mathbb{R})$ are symmetric and increasing on $\mathbb{R}_{+}$ for the stake parity problem (5.3). Let $\gamma_{+}(t)$ be defined by (3.7), and

\begin{equation}
\gamma_{-}(t):=-\overline{\nu}N(t)\int_{0}^{t}\frac{ds}{N(s)}+\frac{xN(t)}{N}\quad\text{for }0\leq t\leq T, \tag{5.4}
\end{equation}

and

\begin{equation}
t_{\pm}:=\inf\left\{t>0:\overline{\nu}\int_{0}^{t}\frac{ds}{N(s)}=\pm\left(\frac{1}{K}-\frac{x}{N}\right)\right\}. \tag{5.5}
\end{equation}

Then, the following results hold.

  1. (i)

    If $x>N\left(\frac{1}{K}+\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)$, then the optimal strategy is $\nu_{*}(t)=-\overline{\nu}$ for all $0\leq t\leq T$, and $U(x)=\int_{0}^{T}e^{-\delta t}g\left(\gamma_{-}(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(\gamma_{-}(T)-\frac{N(T)}{K}\right)$.

  2. (ii)

    If $\frac{N}{K}<x\leq N\left(\frac{1}{K}+\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)$, then the optimal strategy is

    \[\nu_{*}(t)=\begin{cases}-\overline{\nu} & \text{for } t\leq t_{-},\\ 0 & \text{for } t>t_{-},\end{cases}\]

    and $U(x)=\int_{0}^{t_{-}}e^{-\delta t}g\left(\gamma_{-}(t)-\frac{N(t)}{K}\right)dt+\frac{g(0)}{\delta}\left(e^{-\delta t_{-}}-e^{-\delta T}\right)+e^{-\delta T}q(0)$.

  3. (iii)

    If $N\left(\frac{1}{K}-\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)\leq x<\frac{N}{K}$, then the optimal strategy is

    \[\nu_{*}(t)=\begin{cases}\overline{\nu} & \text{for } t\leq t_{+},\\ 0 & \text{for } t>t_{+},\end{cases}\]

    and $U(x)=\int_{0}^{t_{+}}e^{-\delta t}g\left(\gamma_{+}(t)-\frac{N(t)}{K}\right)dt+\frac{g(0)}{\delta}\left(e^{-\delta t_{+}}-e^{-\delta T}\right)+e^{-\delta T}q(0)$.

  4. (iv)

    If $x<N\left(\frac{1}{K}-\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)$, then the optimal strategy is $\nu_{*}(t)=\overline{\nu}$ for all $0\leq t\leq T$, and $U(x)=\int_{0}^{T}e^{-\delta t}g\left(\gamma_{+}(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(\gamma_{+}(T)-\frac{N(T)}{K}\right)$.

Proof.

(i) If $x>N\left(\frac{1}{K}+\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)$, we have $\gamma_{-}(t)>N(t)/K$ for all $0\leq t\leq T$. By a comparison argument, we get $X(t)\geq\gamma_{-}(t)$ for all $0\leq t\leq T$ given any feasible strategy $\nu(t)$. Since $g,q$ are increasing on $\mathbb{R}_{+}$, we obtain

\begin{align*}
&\int_{0}^{T}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(X(T)-\frac{N(T)}{K}\right)\\
&\qquad\geq\int_{0}^{T}e^{-\delta t}g\left(\gamma_{-}(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(\gamma_{-}(T)-\frac{N(T)}{K}\right),
\end{align*}

which yields the desired result.
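The comparison argument used here can be spelled out in one line. The following worked step is a sketch under two assumptions not stated explicitly in this section: that $N$ without an argument denotes the initial volume $N(0)$ (so that $\gamma_{-}(0)=x$), and that $N(t)>0$ on $[0,T]$. Dividing the state by $N(t)$ removes the reward term from (C0), so the constraint $\nu(t)\geq-\overline{\nu}$ integrates directly:

```latex
\begin{align*}
\frac{d}{dt}\left(\frac{X(t)}{N(t)}\right)
  &= \frac{X'(t)}{N(t)}-\frac{N'(t)\,X(t)}{N^{2}(t)}
   = \frac{\nu(t)}{N(t)} \;\geq\; -\frac{\overline{\nu}}{N(t)}, \\
\frac{X(t)}{N(t)}
  &\geq \frac{x}{N}-\overline{\nu}\int_{0}^{t}\frac{ds}{N(s)}
   = \frac{\gamma_{-}(t)}{N(t)} .
\end{align*}
```

Equality holds when $\nu\equiv-\overline{\nu}$, which also confirms that $\gamma_{-}$ in (5.4) is the state path under full-capacity selling.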

(ii) If $\frac{N}{K}<x\leq N\left(\frac{1}{K}+\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)$, we have $\gamma_{-}(t)>N(t)/K$ for $0\leq t<t_{-}$ and $\gamma_{-}(t_{-})=N(t_{-})/K$. Again by the comparison argument, $X(t)\geq\gamma_{-}(t)$ for $0\leq t\leq t_{-}$ given any strategy. Thus,

\begin{align*}
&\int_{0}^{T}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(X(T)-\frac{N(T)}{K}\right)\\
&\quad=\int_{0}^{t_{-}}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt+\int_{t_{-}}^{T}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(X(T)-\frac{N(T)}{K}\right)\\
&\quad\geq\int_{0}^{t_{-}}e^{-\delta t}g\left(\gamma_{-}(t)-\frac{N(t)}{K}\right)dt+g(0)\int_{t_{-}}^{T}e^{-\delta t}dt+e^{-\delta T}q(0),
\end{align*}

where $g(0)\int_{t_{-}}^{T}e^{-\delta t}dt=\frac{g(0)}{\delta}\left(e^{-\delta t_{-}}-e^{-\delta T}\right)$. This lower bound is attained by selling at full capacity up to $t_{-}$ and not trading afterwards, which allows us to conclude.

(iii) and (iv) follow from the same arguments as (ii) and (i), respectively. ∎
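For a concrete feel of cases (i) and (ii), the hitting time $t_{-}$ and the value $U(x)$ can be evaluated explicitly under simplifying assumptions. The sketch below is illustrative only: it assumes a hypothetical linear volume $N(t)=N_{0}+\rho t$ with $\rho>0$ (so that $\int_{0}^{t}ds/N(s)=\rho^{-1}\log(1+\rho t/N_{0})$), reads $N$ in (5.4) as $N(0)=N_{0}$, and takes quadratic penalties $g(y)=g\,y^{2}$, $q(y)=q\,y^{2}$, for which $g(0)=q(0)=0$. All function names and parameter values are made up for the example, and only the selling side $x>N_{0}/K$ is covered.

```python
import math


def integral_inv_N(t, N0, rho):
    """int_0^t ds/N(s) for the hypothetical volume N(s) = N0 + rho*s, rho > 0."""
    return math.log1p(rho * t / N0) / rho


def gamma_minus(t, x, N0, rho, nu_bar):
    """Full-capacity selling path (5.4), with N read as N(0) = N0."""
    Nt = N0 + rho * t
    return -nu_bar * Nt * integral_inv_N(t, N0, rho) + x * Nt / N0


def t_minus(x, N0, rho, K, nu_bar):
    """Hitting time t_- in (5.5) for x > N0/K, in closed form for linear N."""
    return N0 / rho * math.expm1(rho * (x / N0 - 1.0 / K) / nu_bar)


def parity_cost(x, N0, rho, K, nu_bar, T, delta, g_coef, q_coef, steps=20_000):
    """U(x) in cases (i)-(ii) for g(y) = g_coef*y^2 and q(y) = q_coef*y^2."""
    t_hit = t_minus(x, N0, rho, K, nu_bar)
    t_end = min(t_hit, T)                      # sell until parity or the horizon
    dev = lambda t: gamma_minus(t, x, N0, rho, nu_bar) - (N0 + rho * t) / K
    h = t_end / steps
    vals = [math.exp(-delta * i * h) * g_coef * dev(i * h) ** 2
            for i in range(steps + 1)]
    running = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))   # trapezoidal rule
    # Once parity is reached, g(0) = q(0) = 0, so no further cost accrues.
    terminal = math.exp(-delta * T) * q_coef * dev(T) ** 2 if t_hit > T else 0.0
    return running + terminal


t_hit = t_minus(x=12.0, N0=100.0, rho=10.0, K=10, nu_bar=1.0)
cost = parity_cost(12.0, 100.0, 10.0, 10, 1.0, T=5.0, delta=0.05,
                   g_coef=1.0, q_coef=1.0)
print(t_hit, cost)
```

The closed form for `t_minus` comes from solving $\overline{\nu}\rho^{-1}\log(1+\rho t/N_{0})=x/N_{0}-1/K$ for $t$; for a general volume path $N(t)$ one would instead locate $t_{-}$ by root finding on the integral in (5.5).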

6. Conclusion

We have developed in this paper a continuous-time control approach to optimal trading under the PoS protocol, formulated as a consumption-investment problem. We present general solutions to the optimal control problem via dynamic programming and the HJB equations, and, in the case of linear or convex utility functions, closed-form solutions in the form of bang-bang controls. Furthermore, we bring out the explicit connections between the rate of return in trading/holding stakes and the participant’s risk-adjusted valuation of the stakes, such that the participant’s risk sensitivity is explicitly accounted for in the trading strategy. We have also studied a risk-control version of the consumption-investment problem, and for a special case, the “stake-parity” problem, we show that a mean-reverting strategy is optimal.

While our focus here is entirely on an individual participant’s trading strategy in a PoS protocol, it is possible to study the interactions among the participants, to formulate the problem of trading in a PoS protocol as a game (deterministic or stochastic), and to study issues such as equilibrium, social welfare, and the inclusion of a trusted third party (or market maker). This will be the focus of a follow-up paper.


Acknowledgement: W. Tang gratefully acknowledges financial support through NSF grants DMS-2113779 and DMS-2206038, and through a start-up grant at Columbia University. David Yao’s work is part of a Columbia-CityU/HK collaborative project that is supported by InnotHK Initiative, The Government of the HKSAR and the AIFT Lab.

References

  • [1] H. Alsabah and A. Capponi. Pitfalls of Bitcoin’s Proof-of-Work: R&D arms race and mining centralization. 2020. SSRN:3273982.
  • [2] L. Ambrosio. Transport equation and Cauchy problem for non-smooth vector fields. In Calculus of variations and nonlinear partial differential equations, volume 1927 of Lecture Notes in Math., pages 1–41. Springer, Berlin, 2008.
  • [3] N. Arnosti and S. M. Weinberg. Bitcoin: A natural oligopoly. Management Science, 2022.
  • [4] C. Bertucci, L. Bertucci, J.-M. Lasry, and P.-L. Lions. Mean field game approach to Bitcoin mining. 2020. arXiv:2004.08167.
  • [5] C. Bertucci, L. Bertucci, J.-M. Lasry, and P.-L. Lions. How resilient is the Bitcoin protocol? 2022. SSRN:3907822.
  • [6] V. Buterin. Toward a 12-second block time. 2014. Available at https://blog.ethereum.org/2014/07/11/toward-a-12-second-block-time.
  • [7] J. Chiu and T. V. Koeppl. The economics of cryptocurrencies–Bitcoin and beyond. 2017. SSRN:3048124.
  • [8] J. Chod, N. Trichakis, G. Tsoukalas, H. Aspegren, and M. Weber. On the financing benefits of supply chain transparency and blockchain adoption. Management Science, 66(10):4378–4396, 2020.
  • [9] F. Donovan. Healthcare blockchain could save industry $100b annually by 2025. HIT Infrastructure, 2019. Available at https://hitinfrastructure.com/news/healthcare-blockchain-could-save-industry-100b-annually-by-2025.
  • [10] W. Duggan and F. Powell. What is Ethereum 2.0? Understanding the merge. 2022. Available at https://www.forbes.com/uk/advisor/investing/cryptocurrency/what-is-ethereum-2/.
  • [11] W. H. Fleming and H. M. Soner. Controlled Markov processes and viscosity solutions, volume 25 of Stochastic Modelling and Applied Probability. Springer, New York, second edition, 2006.
  • [12] F. Golse. Mean field kinetic equations. 2013. Available at http://www.cmls.polytechnique.fr/perso/golse/M2/PolyKinetic.pdf.
  • [13] X. Guo, W. Tang, and R. Xu. A class of stochastic games and moving free boundary problems. SIAM J. Control Optim., 60(2):758–785, 2022.
  • [14] S. King and S. Nadal. Ppcoin: Peer-to-peer crypto-currency with proof-of-stake. 2012. Available at https://decred.org/research/king2012.pdf.
  • [15] Z. Li, A. M. Reppen, and R. Sircar. A mean field games model for cryptocurrency mining. 2019. arXiv:1912.01952.
  • [16] H. M. Markowitz. Portfolio selection: Efficient diversification of investments. Cowles Foundation for Research, Monograph 16. John Wiley & Sons, Inc., New York, 1959.
  • [17] C. Mora, R. L. Rollins, K. Taladay, M. B. Kantar, M. K. Chock, M. Shimada, and E. C. Franklin. Bitcoin emissions alone could push global warming above 2°C. Nat. Clim. Change, 8(11):931–933, 2018.
  • [18] S. Nakamoto. Bitcoin: A peer-to-peer electronic cash system. Decentralized Business Review, page 21260, 2008.
  • [19] S. Osher and C.-W. Shu. High-order essentially nonoscillatory schemes for Hamilton-Jacobi equations. SIAM J. Numer. Anal., 28(4):907–922, 1991.
  • [20] M. Platt, J. Sedlmeir, D. Platt, P. Tasca, J. Xu, N. Vadgama, and J. I. Ibañez. Energy footprint of blockchain consensus mechanisms beyond proof-of-work. 2021. arXiv:2109.03667.
  • [21] I. Roşu and F. Saleh. Evolution of shares in a proof-of-stake cryptocurrency. Manag. Sci., 67(2):661–672, 2021.
  • [22] F. Saleh. Blockchain without waste: Proof-of-stake. The Review of Financial Studies, 34(3):1156–1190, 2021.
  • [23] P. E. Souganidis. Approximation schemes for viscosity solutions of Hamilton-Jacobi equations. J. Differential Equations, 59(1):1–43, 1985.
  • [24] W. Tang. Stability of shares in the Proof of Stake protocol – concentration and phase transitions. 2022. arXiv:2206.02227.
  • [25] W. Tang and D. D. Yao. Polynomial voting rules. 2022. arXiv:2206.10105.
  • [26] Q. Wang, R. Li, Q. Wang, and S. Chen. Non-fungible token (NFT): Overview, evaluation, opportunities and challenges. 2021. arXiv:2105.07447.
  • [27] A. Wood. West Virginia secretary of state reports successful blockchain voting in 2018 midterm elections. 2018. Available at https://cointelegraph.com/news/west-virginia-secretary-of-state-reports-successful-blockchain-voting-in-2018-midterm-elections.
  • [28] G. Wood. Ethereum: A secure decentralised generalised transaction ledger. Ethereum Project Yellow Paper, 151:1–32, 2014.