Trading under the Proof-of-Stake Protocol
– a Continuous-Time Control Approach

Wenpin Tang Department of Industrial Engineering and Operations Research, Columbia University. [email protected] and David D. Yao Department of Industrial Engineering and Operations Research, Columbia University. [email protected]

Abstract.

We develop a continuous-time control approach to optimal trading in a Proof-of-Stake (PoS) blockchain, formulated as a consumption-investment problem that aims to strike the optimal balance between a participant’s (or agent’s) utility from holding/trading stakes and utility from consumption. We present solutions via dynamic programming and the Hamilton-Jacobi-Bellman (HJB) equations. When the utility functions are linear or convex, we derive closed-form solutions and show that the bang-bang strategy is optimal (i.e., always buy or sell at full capacity). Furthermore, we bring out the explicit connection between the rate of return in trading/holding stakes and the participant’s risk-adjusted valuation of the stakes. In particular, we show that when a participant is risk-neutral or risk-seeking, corresponding to the risk-adjusted valuation being a martingale or a sub-martingale, the optimal strategy must be to either buy all the time, sell all the time, or first buy then sell, with both buying and selling executed at full capacity. We also propose a risk-control version of the consumption-investment problem; and for a special case, the “stake-parity” problem, we show that a mean-reverting strategy is optimal.

Key words: Consumption-investment, Proof of Stake (PoS) protocol, cryptocurrency, dynamic programming, HJB equations, continuous-time control, risk control.

1. Introduction

As a digital exchange vehicle, blockchain technology has been successfully deployed in many applications including cryptocurrency [18], healthcare [9], supply chain [8], electoral voting [27], and non-fungible tokens [26]. A blockchain is a growing chain of accounting records, called blocks, which are jointly maintained by participants of the system using cryptography. Consider for instance Bitcoin – a peer-to-peer decentralized payment system. In contrast to traditional payment processing networks, Bitcoin provides a permissionless environment in which everyone is free to participate. At the core of Bitcoin is the consensus protocol known as Proof of Work (PoW), in which “miners” compete with each other by solving a hashing puzzle so as to validate an ever-growing log of transactions (the “longest chain”) to update a distributed ledger; and the miner who solves the puzzle first receives a reward (a number of coins). Thus, while the competition is open to all participants, the chance of winning is proportional to a miner’s computing power.

Despite its popularity, the PoW protocol has some obvious drawbacks. Competition among miners has led to exploding levels of energy consumption in Bitcoin mining [17, 20]. It has also been pointed out in [1, 3, 7] that PoW mining will lead to centralization, violating the core tenet of decentralization. To solve the problem of energy efficiency, [14, 28] introduced another consensus protocol – Proof of Stake (PoS), which is a bidding mechanism to select a miner to validate the new block. Participants who choose to join the bidding process are required to commit certain stakes (coins they own), and the winning probability is proportional to the stakes committed. Hence, a participant in a PoS blockchain is a “bidder”, and only the winning bidder becomes the miner who does the validation. As yet the PoS protocol has not been as popular as PoW. However, it is catching up quickly, and blockchain developers have strong incentives to switch from a PoW to a PoS ecosystem. A prominent case in this direction is Ethereum 2.0, where two parallel chains – Mainnet (PoW) and Beacon Chain (PoS) are expected soon to merge into one unified PoS blockchain [10].

There has been an active stream of recent studies on PoS in the research literature; and here we briefly mention several that relate closely to our study. In [22] it is shown that the PoS protocol is “without waste” from an economic standpoint. Issues of stability and decentralization of the PoS protocol are examined in [21, 24]. Specifically, it is shown in [21] that for large owners of initial wealth in a PoS system their shares of the total wealth will remain stable in the long run (i.e., proportions to the total wealth will remain constant), and hence the rich-get-richer phenomenon will not happen. [24] further extends this to medium and small participants, and reveals a phase transition in share stability among those different types of participants. In [21, 25], various aspects of the consumption-investment problem in PoS are examined, and certain conditions are identified under which a participant may have no incentive to trade with others. This leads to the complementary question: given that a participant does prefer to trade, what is the optimal trading strategy?

Motivated by the above question, the objective of our study here is to develop a continuous-time control approach to optimal trading in a PoS blockchain. While the control (or game) approach has been proposed in previous studies [4, 5, 15], they are all for the PoW protocol. To the best of our knowledge, ours is the first control model developed for optimal trading under the PoS protocol.

Here is an overview of our main results. We first formulate the consumption-investment problem, which aims to strike a balance between a participant’s utility from holding/trading stakes and utility from consumption. It takes the form of a deterministic control problem with the real-time trading strategy being the control variable. We start with a detailed analysis on a special case that we call the “stake-hoarding” problem (Proposition 3.1), where we bring out the possible scenario of monopoly. We then solve the general consumption-investment problem via dynamic programming and the Hamilton-Jacobi-Bellman (HJB) equations (Theorem 3.4).

When the utility functions are linear or convex, more explicit solutions can be obtained, and we show that the bang-bang control is optimal, i.e., always buy or sell at full capacity (Propositions 4.1 and 4.3). Along with the optimal trading strategy, we are also able to bring out the explicit connection between the rate of return in trading/holding stakes and the participant’s risk-adjusted valuation of the stakes. In other words, the participant’s risk sensitivity is explicitly accounted for in the trading strategy. In particular, when a participant is risk-neutral or risk-seeking, corresponding to the risk-adjusted valuation being a martingale or a sub-martingale, the optimal strategy must be to either buy all the time, sell all the time, or first buy then sell (with both buying and selling executed at full capacity).

Finally, we propose a risk control version of the consumption-investment problem, by adding a penalty term to control the level of stake holding so as to reduce the level of concentration risk (Theorem 5.1). A special case is a “stake-parity” problem, where the participant’s holding is controlled at a level that tries to track the system-wide average. We show that the “mean-reverting” strategy is the optimal solution to the stake-parity problem (Proposition 5.2).

The rest of the paper is organized as follows. Section 2 details the formulation of the consumption-investment problem under the PoS protocol. Section 3 presents the optimal solution to the problem, and Section 4 focuses on the special case of linear and convex utility functions. Section 5 presents extensions to risk-control objectives. Concluding remarks are summarized in Section 6.

2. Model Formulation

This section introduces the problem of trading under the PoS protocol in continuous time, and formulates a control model to solve the problem. First, collected below are some conventions that will be used throughout this paper.

– $\mathbb{R}$ denotes the set of real numbers, and $\mathbb{R}_{+}$ denotes the set of nonnegative real numbers.

– For $x,y\in\mathbb{R}$, $x\wedge y$ denotes the smaller number of $x$ and $y$; $x\vee y$ denotes the larger number of $x$ and $y$.

– The symbol $x=o(y)$ means $\frac{x}{y}$ decays towards zero as $y\to\infty$.

– For a random variable $X$, $\mathbb{E}(X)$ denotes the expectation of $X$.

– Let $\Omega$ be a subset of $\mathbb{R}$. A function $f\in\mathcal{C}^{k}(\Omega)$ if it is $k$-times continuously differentiable in $\Omega$.

– For $f\in\mathcal{C}^{1}([0,T])$, $f^{\prime}(t)$ denotes the derivative of $f$. For $f\in\mathcal{C}^{1}([0,T]\times\Omega)$, $\partial_{t}f$ (resp. $\partial_{x}f$) denotes the partial derivative of $f$ with respect to $t$ (resp. $x$).

Time is continuous, indexed by $t\in[0,T]$ , for a fixed $T>0$ representing the length of a finite horizon. Let $\{N(t),\,0\leq t\leq T\}$ (with $N(0):=N$ ) denote the process of the total volume of stakes, which are issued over time by the PoS protocol, and can either be deterministic or stochastic. For ease of presentation, we consider a deterministic process $N(t)$ , which is increasing in time and sufficiently smooth, with the derivative $N^{\prime}(t)$ representing the instantaneous rate of “reward” — additional stakes (or “coins”) injected into the system specified (exogenously) by the PoS protocol. For instance, we will consider below, as a special case, the process $N(t)$ of a polynomial form:

N_{\alpha}(t)=(N^{\frac{1}{\alpha}}+t)^{\alpha},\qquad t\geq 0.

(2.1)

Then, $N^{\prime}_{\alpha}(t)=\alpha(N^{\frac{1}{\alpha}}+t)^{\alpha-1}$ , and $N^{\prime\prime}_{\alpha}(t)=\alpha(\alpha-1)(N^{\frac{1}{\alpha}}+t)^{\alpha-2}$ , so the parametric family (2.1) covers different rewarding schemes according to the values of $\alpha$ .

• For $0<\alpha<1$, we have $N^{\prime\prime}_{\alpha}(t)<0$ so the process $N_{\alpha}(t)$ corresponds to a decreasing reward (e.g. Bitcoin);

• For $\alpha=1$, the process $N_{1}(t)=N+t$ gives a rate one constant reward (e.g. Blackcoin);

• For $\alpha>1$, we get $N^{\prime\prime}_{\alpha}(t)>0$ and hence, the process $N_{\alpha}(t)$ amounts to an increasing reward (e.g. EOS).
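The three reward regimes of the family (2.1) can be explored numerically. The following is a minimal sketch only; the function names `total_stakes`, `reward_rate` and `reward_trend` are hypothetical and introduced here for illustration, and `n0` stands for the initial volume $N$.

```python
def total_stakes(t, n0, alpha):
    """N_alpha(t) = (n0**(1/alpha) + t)**alpha, the total stake volume in (2.1)."""
    return (n0 ** (1.0 / alpha) + t) ** alpha


def reward_rate(t, n0, alpha):
    """N'_alpha(t) = alpha * (n0**(1/alpha) + t)**(alpha - 1), the instantaneous reward."""
    return alpha * (n0 ** (1.0 / alpha) + t) ** (alpha - 1.0)


def reward_trend(alpha):
    """Classify the scheme by the sign of N''_alpha (assumes alpha > 0)."""
    if alpha < 1:
        return "decreasing"
    if alpha == 1:
        return "constant"
    return "increasing"


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0):
        rates = [reward_rate(t, 100.0, alpha) for t in (0.0, 10.0, 20.0)]
        print(alpha, reward_trend(alpha), rates)
```

By construction $N_{\alpha}(0)=N$ for every $\alpha>0$, so the three schemes start from the same volume and differ only in how fast new stakes are issued.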

Let $K\geq 2$ denote the total number of participants in the system, who are indexed by $k\in[K]:=\{1,\ldots,K\}$. For each participant $k$, let $\{X_{k}(t),\,0\leq t\leq T\}$ (with $X_{k}(0)=x_{k}$) denote the process of the number of stakes that participant $k$ holds, with $X_{k}(t)\geq 0$ and $\sum_{k=1}^{K}X_{k}(t)=N(t)$ for all $t\in[0,T]$. In the (discrete-time) PoS protocol, in each round of the bidding process, individual participants commit stakes so as to be selected to validate the block and receive a reward; and the winning probability is $X_{k}(t)/N(t)$ for participant $k$, i.e., proportional to the number of stakes committed. (For instance, each round in Ethereum takes about $10$ seconds, corresponding to the block-generation time [6].) For our continuous-time PoS model here, in which the time required for each round of bidding is “infinitesimal,” imagine there are $M$ rounds of bidding during any given time interval $[t,t+\Delta t]$. In each round participant $k$ gets either some stake(s) or nothing; so the average total number of stakes $k$ will get over the $M$ rounds is (by the law of large numbers when $M$ is large),

\underbrace{\frac{X_{k}(t)}{N(t)}\frac{N^{\prime}(t)\Delta t}{M}}_{\tiny\mbox{average number of stakes in each round}}\times\underbrace{M}_{\tiny\mbox{number of rounds}}=\quad\frac{X_{k}(t)}{N(t)}N^{\prime}(t)\Delta t.

Hence, replacing $\Delta t$ by the infinitesimal $dt$ , we know participant $k$ will receive (on average) $\frac{X_{k}(t)}{N(t)}N^{\prime}(t)dt$ stakes, where $\frac{X_{k}(t)}{N(t)}$ is $k$ ’s winning probability, and $N^{\prime}(t)dt$ is the reward issued by the blockchain in $[t,t+dt]$ .
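The averaging argument above can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions: the helper `simulate_reward` is hypothetical, each round is taken to pay `reward_total / rounds` to the participant with probability `share` (standing for $X_{k}(t)/N(t)$), and `reward_total` stands for $N^{\prime}(t)\Delta t$.

```python
import random


def simulate_reward(share, reward_total, rounds, seed=0):
    """Total stakes won over `rounds` independent bidding rounds.

    In each round the participant wins reward_total / rounds with
    probability `share`, and nothing otherwise.
    """
    rng = random.Random(seed)
    per_round = reward_total / rounds
    return sum(per_round for _ in range(rounds) if rng.random() < share)


if __name__ == "__main__":
    share, reward_total = 0.2, 5.0
    for rounds in (10, 1_000, 100_000):
        won = simulate_reward(share, reward_total, rounds)
        # The law of large numbers suggests `won` approaches share * reward_total.
        print(rounds, won, share * reward_total)
```

As the number of rounds grows, the simulated total should concentrate around `share * reward_total`, which is the drift term $\frac{X_{k}(t)}{N(t)}N^{\prime}(t)\Delta t$ used in the continuous-time model.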

Participants are allowed to trade (buy or sell) their stakes. Participant $k$ will buy $\nu_{k}(t)dt$ stakes in $[t,t+dt]$ if $\nu_{k}(t)>0$ , and sell $-\nu_{k}(t)dt$ stakes if $\nu_{k}(t)<0$ . This leads to the following dynamics of participant $k$ ’s stakes under trading:

X^{\prime}_{k}(t)=\nu_{k}(t)+\frac{N^{\prime}(t)}{N(t)}X_{k}(t)\quad\mbox{for }0\leq t\leq\tau_{k}\wedge T:=\mathcal{T}_{k},

(2.2)

where $\tau_{k}:=\inf\{t>0:X_{k}(t)=0\}$ is the first time at which the process $X_{k}(t)$ reaches zero. It is reasonable to stop the trading process if a participant runs out of stakes, or gets all available stakes:

• If $\mathcal{T}_{k}=\tau_{k}$, then participant $k$ liquidates all his stakes by time $\tau_{k}$, and $X_{k}(\mathcal{T}_{k})=0$;

• If $\mathcal{T}_{k}=\max_{j\neq k}\tau_{j}$, then participant $k$ gets all issued stakes by time $\max_{j\neq k}\tau_{j}$, and hence $X_{k}(\mathcal{T}_{k})=N(\mathcal{T}_{k})$.

We set $X_{k}(t)=X_{k}(\mathcal{T}_{k})$ for $t>\mathcal{T}_{k}$ .
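To make the stopped dynamics (2.2) concrete, the following sketch integrates them with a forward Euler scheme and stops as soon as the holding reaches zero or the total volume. It is an illustration only: `simulate_stakes` and its arguments are hypothetical names, the control is supplied as a function of time and state, and the Euler step introduces a small discretization error.

```python
def simulate_stakes(x0, control, total, total_rate, horizon, dt=1e-4):
    """Euler scheme for X' = nu + (N'/N) X, stopped at X = 0 or X = N(t).

    control(t, x) is the trading rate nu, total(t) is N(t), and
    total_rate(t) is N'(t). Returns (exit_time, terminal_holding).
    """
    t, x = 0.0, x0
    while t < horizon:
        x_next = x + dt * (control(t, x) + total_rate(t) / total(t) * x)
        t += dt
        if x_next <= 0.0:
            return t, 0.0          # participant has liquidated all stakes
        if x_next >= total(t):
            return t, total(t)     # participant holds every stake issued
        x = x_next
    return horizon, x


if __name__ == "__main__":
    # Assumed example: N(t) = 100 + t, selling at the constant rate 5.
    exit_time, holding = simulate_stakes(
        10.0, lambda t, x: -5.0, lambda t: 100.0 + t, lambda t: 1.0, 10.0
    )
    print(exit_time, holding)
```

In this assumed example the exact liquidation time solves $5\log(1+\tau/100)=10/100$, i.e. $\tau=100(e^{0.02}-1)\approx 2.02$, which the simulated exit time should approximate.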

The problem is for each participant $k$ to decide how to trade stakes with others under the PoS protocol. Let $\{P(t),\,0\leq t\leq T\}$ be the price process of each (unit of) stake, which is a stochastic process assumed to be independent of the dynamics in (2.2). (This assumption has appeared in recent studies (e.g., [21]), and is to some extent a reflection of the reality that the crypto price tends to be affected by market shocks such as macroeconomics, geopolitics, breaking news, etc., much more than by trading activities.) Here, the price $P(t)$ of each stake is measured in terms of an underlying risk-free asset (referred to as “cash” for simplicity); and let $b_{k}(t)$ denote the (units of) risk-free asset that participant $k$ holds at time $t$, and let $r>0$ denote the risk-free (interest) rate. Also note that all $K$ participants are allowed to trade stakes (with cash) only internally among themselves, whereas each participant can only exchange cash with an external source (say, a bank).

The decision for each participant $k$ at $t$ is hence a tuple $(\nu_{k}(t),b_{k}(t))$ . Let $\{c_{k}(t),\,0\leq t\leq T\}$ be the process of consumption, or cash flow of participant $k$ , which follows the dynamics below:

dc_{k}(t)=rb_{k}(t)dt-db_{k}(t)-P(t)\nu_{k}(t)dt,\qquad 0\leq t\leq\mathcal{T}_{k};

(C1)

with

b_{k}(0)=0,\quad b_{k}(t)\geq 0\mbox{ for }0\leq t\leq\mathcal{T}_{k},\quad 0\leq X_{k}(t)\leq N(t)\mbox{ for }0\leq t\leq\mathcal{T}_{k}.

(C2)

Set $b_{k}(t)=b_{k}(\mathcal{T}_{k})$ and $\nu_{k}(t)=0$ for $t>\mathcal{T}_{k}$ .

In (C1), if $db_{k}(t)<0$, the participant sells the risk-free asset to get cash either for buying stakes, or for consumption; if $db_{k}(t)>0$, the participant adds more risk-free asset. Thus, (C1) is a self-financing condition in which $rb_{k}(t)dt-db_{k}(t)$ is the net change (in value of the risk-free asset held) used to finance new stakes $P(t)\nu_{k}(t)dt$ and consumption $dc_{k}(t)$. The requirements in (C2) are all in the spirit of disallowing shorting on either the risk-free asset $b_{k}(t)$ or the stakes $X_{k}(t)$. In some PoS blockchains, there is a minimum requirement for bidding (e.g. 32 ETHs for Ethereum). In this case, we can impose a lower bound on the process $X_{k}(t)$, to prevent it from falling below this threshold. The analysis will be similar. We also require that the trading strategy be bounded: there is $\overline{\nu}_{k}>0$ such that

|\nu_{k}(t)|\leq\overline{\nu}_{k}.

(C3)

The objective of participant $k$ is:

	$\displaystyle\sup_{\{(\nu_{k}(t),b_{k}(t))\}}$	$\displaystyle J(\nu_{k},b_{k}):=\mathbb{E}\left\{\int_{0}^{\mathcal{T}_{k}}e^{-\beta_{k}t}\left[dc_{k}(t)+\ell_{k}(X_{k}(t))dt\right]+e^{-\beta_{k}\mathcal{T}_{k}}\left[b_{k}(\mathcal{T}_{k})+h_{k}(X_{k}(\mathcal{T}_{k}))\right]\right\}$		(2.3)
		$\displaystyle\mbox{ subject to (2.2), (C1), (C2), (C3)},$

where $\beta_{k}>0$ is a discount factor, a parameter measuring the risk sensitivity of participant $k$ ; $\ell_{k}(\cdot)$ and $h_{k}(\cdot)$ are two utility functions representing, respectively, the running profit and the terminal profit.

While generally following Merton’s consumption-investment framework, our formulation as presented above takes into account some distinct features of PoS blockchains and cryptocurrencies. One notable point is, the utilities $\ell$ and $h$ in the objective are expressed as functions of the number of stakes $X_{k}(t)$ , as opposed to their total value $P(t)X_{k}(t)$ . To the extent that $P(t)$ is treated as exogenous (as explained above), this difference may seem to be trivial. Yet, it is a reflection of the more substantial fact that crypto-participants tend to mentally decouple the utility of holding stakes from their monetary value at any given time. For instance, holding $1$ ETH may be equivalent to $\$5,000$ for one person, and $\$500$ for another, and neither will be influenced by the ETH market price at the time, which could be say, about $\$1,500$ .

Throughout below, the following conditions will be assumed:

Assumption 2.1.

(i) $N:[0,T]\to\mathbb{R}_{+}$ is increasing with $N(0)=N>0$, and $N\in\mathcal{C}^{2}([0,T])$.

(ii) $\ell:\mathbb{R}_{+}\to\mathbb{R}_{+}$ is increasing and $\ell\in\mathcal{C}^{1}(\mathbb{R}_{+})$.

(iii) $h:\mathbb{R}_{+}\to\mathbb{R}_{+}$ is increasing and $h\in\mathcal{C}^{1}(\mathbb{R}_{+})$.

3. The Consumption-Investment Problem

Here we study the consumption-investment problem for participant $k$ in (2.3). To lighten notation, omit the subscript $k$ , and write out the problem in full as follows, where (C0) is a repeat of the state dynamics in (2.2):

$\displaystyle U(x):=\sup_{\{(\nu(t),b(t))\}}$	$\displaystyle J(\nu,b):=\mathbb{E}\left\{\int_{0}^{\mathcal{T}}e^{-\beta t}\left[dc(t)+\ell(X(t))dt\right]+e^{-\beta\mathcal{T}}\left[b(\mathcal{T})+h(X(\mathcal{T}))\right]\right\}$	(3.1)
	$\displaystyle\mbox{ subject to }X^{\prime}(t)=\nu(t)+\frac{N^{\prime}(t)}{N(t)}X(t),\,X(0)=x,$	(C0)
	$\displaystyle\qquad\qquad\quad\,dc(t)=rb(t)dt-db(t)-P(t)\nu(t)dt,$	(C1)
	$\displaystyle\qquad\qquad\quad\,b(0)=0,\,b(t)\geq 0\mbox{ and }0\leq X(t)\leq N(t),$	(C2)
	$\displaystyle\qquad\qquad\quad\,\|\nu(t)\|\leq\overline{\nu},$	(C3)

where $\mathcal{T}:=\inf\{t>0:X(t)=0\mbox{ or }N(t)\}\wedge T$ .

Note that the expectation in the objective function is with respect to $P(t)$, which is involved in $dc(t)$ via (C1). Denote

\widetilde{P}_{\beta}(t):=e^{-\beta t}\mathbb{E}P(t),\qquad t\in[0,T].

(3.2)

Substituting the constraint (C1) into the objective function, and taking into account

rb(t)dt-db(t)=-e^{rt}d(e^{-rt}b(t)),

along with (3.2), we have

$\displaystyle J(\nu,b)$	$\displaystyle=$	$\displaystyle-\int_{0}^{\mathcal{T}}e^{(r-\beta)t}d(e^{-rt}b(t))+e^{-\beta\mathcal{T}}b(\mathcal{T})$	(3.3)
		$\displaystyle\qquad+\underbrace{\int_{0}^{\mathcal{T}}\big[-\widetilde{P}_{\beta}(t)\nu(t)+e^{-\beta t}\ell(X(t))\big]dt+e^{-\beta\mathcal{T}}h(X(\mathcal{T}))}_{:=J_{2}(\nu)}$
	$\displaystyle=$	$\displaystyle\underbrace{(r-\beta)\int_{0}^{\mathcal{T}}e^{-\beta t}b(t)dt}_{:=J_{1}(b)}\;+\;J_{2}(\nu),$

where $b(0)=0$ is used in the last equality. Hence,
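The integration by parts behind the last equality in (3.3) can be checked numerically for a sample cash path. The sketch below is illustrative only: the function names are hypothetical, the path $b$ and its derivative are supplied by the user with $b(0)=0$, and the integrals are approximated by a midpoint rule, using $d(e^{-rt}b(t))=e^{-rt}(b^{\prime}(t)-rb(t))dt$.

```python
import math


def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def cash_terms_direct(b, db, r, beta, horizon):
    """-int_0^T e^{(r-beta)t} d(e^{-rt} b(t)) + e^{-beta T} b(T)."""
    integrand = lambda t: -math.exp(-beta * t) * (db(t) - r * b(t))
    return midpoint_integral(integrand, 0.0, horizon) + math.exp(-beta * horizon) * b(horizon)


def cash_terms_reduced(b, r, beta, horizon):
    """J_1(b) = (r - beta) * int_0^T e^{-beta t} b(t) dt, valid when b(0) = 0."""
    return (r - beta) * midpoint_integral(lambda t: math.exp(-beta * t) * b(t), 0.0, horizon)


if __name__ == "__main__":
    b, db = (lambda t: t * t), (lambda t: 2.0 * t)  # assumed sample path with b(0) = 0
    print(cash_terms_direct(b, db, 0.02, 0.05, 1.0))
    print(cash_terms_reduced(b, 0.02, 0.05, 1.0))
```

The two printed values should agree up to quadrature error, and both are nonpositive whenever $\beta\geq r$ and $b\geq 0$.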

U(x):=\sup_{\{(\nu,b)\}}J(\nu,b)=\sup_{b}J_{1}(b)+\sup_{\nu}J_{2}(\nu).

(3.4)

Next, suppose $\beta\geq r$ , a condition that will be assumed below (and readily justified as the risk premium associated with the valuation of any stake over the risk-free asset). Then, from the $J_{1}(b)$ expression in (3.3), and taking into account $b(t)\geq 0$ as constrained in (C2), we have $\sup_{b}J_{1}(b)=0$ with the optimality binding at $b_{*}(t)=0$ for all $t$ . Consequently, the problem in (3.1) is reduced to

U(x)=\sup_{\nu}J_{2}(\nu)\quad\mbox{subject to (C0), (C2'), (C3)},

(3.5)

where (C2’) is (C2) without the constraints on $b(\cdot)$ .

In summary, the key fact here is that the objective $U(x)$ is separable in the control variables $(\nu(t),b(t))$ ; hence the problem in (3.1) is decomposed into two optimal control problems, one on the risk-free asset $b(t)$ , and the other on the trading of stakes $\nu(t)$ , as specified in (3.3) and (3.4). Moreover, under the condition $\beta\geq r$ , the consumption-investment problem is reduced to the one in (3.5), where the objective function $J_{2}(\nu)$ – refer to (3.3) – takes the form of a tradeoff between the utility from holding stakes ( $\ell(X(t))$ and $h(X(\mathcal{T}))$ ) and the dis-utility of reducing consumption ( $-\widetilde{P}_{\beta}(t)\nu(t)$ ). Thus, the optimal trading strategy needs to strike a balance between these two opposing terms.

Before we present the optimal solution to the consumption-investment problem in (3.5), we make a digression to first study a simple degenerate case of $\widetilde{P}_{\beta}(t)\equiv 0$ . This special case removes the tradeoff mentioned above, so the solution becomes a one-sided strategy of always accumulating (or “hoarding”) the stakes at full capacity ( $\overline{\nu}$ ). Yet, as the analysis below will show, there are still some interesting (and subtle) issues involved. More importantly, this special case provides a very accessible path to finding the optimal solution via dynamic programming and the HJB equation.

3.1. Stake-hoarding

As motivated above, here the problem for participant $k$ is reduced to the following (again, omit the subscript $k$ ):

	$\displaystyle U(x):=\sup_{\nu(t)}\,\,$	$\displaystyle\int_{0}^{\mathcal{T}}e^{-\beta t}\ \ell(X(t))dt+e^{-\beta\mathcal{T}}h(X(\mathcal{T}))$		(3.6)
		$\displaystyle\mbox{ subject to \quad(C0), (C2'), (C3)}.$

Below, we denote $\nu_{*}(t)$ for the optimal control process, $X_{*}(t)$ for the corresponding state process, and $\mathcal{T}_{*}:=\inf\{t>0:X_{*}(t)=N(t)\}\wedge T$ for the exit time.

Proposition 3.1.

Denote

\gamma(t):=\overline{\nu}N(t)\int_{0}^{t}\frac{ds}{N(s)}+\frac{xN(t)}{N}\quad\mbox{for }0\leq t\leq T.

(3.7)

We have:

(i) If $\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\leq\frac{N-x}{N}$, then $\mathcal{T}_{*}=T$. The optimal control is $\nu_{*}(t)=\overline{\nu}$ for $0\leq t\leq T$, the optimal state process is $X_{*}(t)=\gamma(t)$ for $0\leq t\leq T$, and $U(x)=e^{-\beta T}h(X_{*}(T))+\int_{0}^{T}e^{-\beta t}\ell(X_{*}(t))dt$.
(ii) If $\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}>\frac{N-x}{N}$, set

$t_{0}:=\inf\left\{t>0:\overline{\nu}\int_{0}^{t}\frac{ds}{N(s)}=\frac{N-x}{N}\right\}<T.$

Assume further that

$(h\circ N)^{\prime}(t)+\ell(N(t))\leq\beta h(N(t))\quad\mbox{for all }0\leq t\leq T.$ (3.8)

Then, $\mathcal{T}_{*}=t_{0}$ . The optimal strategy is $\nu_{*}(t)=\overline{\nu}$ for $0\leq t\leq t_{0}$ (and $\nu_{*}(t)=0$ for $t>t_{0}$ ), the optimal state process is $X_{*}(t)=\gamma(t)$ for $0\leq t\leq t_{0}$ (and $X_{*}(t)=N(t_{0})$ for $t>t_{0}$ ), and $U(x)=e^{-\beta t_{0}}h(X_{*}(t_{0}))+\int_{0}^{t_{0}}e^{-\beta t}\ell(X_{*}(t))dt$ .

Figure 1. Optimal stake trading: concentration and monopoly.

Deferring the proof, we first make a few comments on the above proposition. Note that $\gamma(t)$ as specified in (3.7) is identified as the optimal state process $X_{*}(t)$, which is the number of stakes given $\nu_{*}(t)=\overline{\nu}$. It is easy to see that the participant’s share of stakes, $X_{*}(t)/N(t)$, is increasing in $t$, leading to centralization regardless of how the rewarding scheme is designed (although large rewards may slow down the speed towards concentration). The interesting point of the above proposition is in its part (ii), where the required condition (3.8) is a technical one, to ensure the optimality of $\nu_{*}(t)=\overline{\nu}$. The more substantive fact is $\mathcal{T}_{*}=t_{0}<T$, when $X(\mathcal{T}_{*})=N(\mathcal{T}_{*})$, i.e., the participant has accumulated all stakes available in the system, leading to the extreme situation of monopoly (or “dictatorship”); and this is done before the end of the horizon, i.e., forcing a premature exit time. See Figure 1 for an illustration.
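For the polynomial family (2.1), the share $\gamma(t)/N(t)=\overline{\nu}\int_{0}^{t}\frac{ds}{N(s)}+\frac{x}{N}$ and the monopoly time $t_{0}$ can be computed directly. The sketch below is a hedged illustration: the function names are hypothetical, the integral uses the antiderivative of $1/N_{\alpha}$, $t_{0}$ is located by bisection (the share is increasing in $t$), and the optimality of buying at full capacity in case (ii) still relies on condition (3.8), which the code does not check.

```python
import math


def inverse_stake_integral(t, n0, alpha):
    """int_0^t ds / N_alpha(s) for N_alpha(s) = (n0**(1/alpha) + s)**alpha."""
    if alpha == 1:
        return math.log(1.0 + t / n0)
    base = n0 ** (1.0 / alpha)
    return ((t + base) ** (1.0 - alpha) - base ** (1.0 - alpha)) / (1.0 - alpha)


def stake_share(t, x, n0, alpha, nu_bar):
    """gamma(t) / N(t) under full-capacity buying, from (3.7)."""
    return nu_bar * inverse_stake_integral(t, n0, alpha) + x / n0


def monopoly_time(x, n0, alpha, nu_bar, horizon):
    """t_0 of Proposition 3.1(ii), or None when case (i) applies."""
    gap = (n0 - x) / n0
    if nu_bar * inverse_stake_integral(horizon, n0, alpha) <= gap:
        return None
    lo, hi = 0.0, horizon
    for _ in range(200):  # bisection on the increasing map t -> share(t)
        mid = 0.5 * (lo + hi)
        if nu_bar * inverse_stake_integral(mid, n0, alpha) < gap:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    print(monopoly_time(50.0, 100.0, 1.0, 10.0, 10.0))  # monopoly before T
    print(monopoly_time(50.0, 100.0, 1.0, 1.0, 10.0))   # None: case (i)
```

For $\alpha=1$ the monopoly time has the explicit form $t_{0}=N\big(e^{(N-x)/(N\overline{\nu})}-1\big)$, which gives a quick check on the bisection.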

The following corollary illustrates further this extreme phenomenon, with the polynomial family $N_{\alpha}(t)$ defined by (2.1), and with a long time horizon ( $T\to\infty$ ).

Corollary 3.2.

Let $(N_{\alpha}(t),\,0\leq t\leq T)$ be defined by (2.1), $(X_{\alpha,*}(t),\,0\leq t\leq T)$ be the optimal state process defined by (3.7) corresponding to $N_{\alpha}(t)$ , and $\mathcal{T}_{\alpha,*}:=\inf\{t>0:X_{\alpha,*}(t)=N_{\alpha}(t)\}$ be the exit time. Assume that the condition (3.8) holds for $N_{\alpha}(t)$ . Then, as $T\to\infty$ , we have

(i) For $\alpha>1$,

– if $\overline{\nu}\leq(\alpha-1)(N-x)N^{-\frac{1}{\alpha}}$, then $X_{\alpha,*}(t)<N_{\alpha}(t)$ for all $t$. Moreover,

$\lim_{t\to\infty}\frac{X_{\alpha,*}(t)}{N_{\alpha}(t)}=\frac{\overline{\nu}}{\alpha-1}N^{\frac{1-\alpha}{\alpha}}+\frac{x}{N}.$ (3.9)

– if $\overline{\nu}>(\alpha-1)(N-x)N^{-\frac{1}{\alpha}}$, then $\mathcal{T}_{\alpha,*}<\infty$.

(ii) For $\alpha\leq 1$, we have $\mathcal{T}_{\alpha,*}<\infty$.

Proof.

Note that

\int_{0}^{T}\frac{dt}{N_{\alpha}(t)}=\left\{\begin{array}[]{lcl}\frac{1}{1-\alpha}\left((T+N^{\frac{1}{\alpha}})^{1-\alpha}-N^{\frac{1-\alpha}{\alpha}}\right)&\mbox{for }\alpha\neq 1\\ \log\left(1+T/N\right)&\mbox{for }\alpha=1.\end{array}\right.

(3.10)

As $T\to\infty$ , the dominant term in $\frac{1}{1-\alpha}\left((T+N^{\frac{1}{\alpha}})^{1-\alpha}-N^{\frac{1-\alpha}{\alpha}}\right)$ is $\frac{1}{\alpha-1}N^{\frac{1-\alpha}{\alpha}}$ if $\alpha>1$ , and is $\frac{1}{1-\alpha}T^{1-\alpha}$ if $\alpha<1$ ; and the dominant term in $\log\left(1+T/N\right)$ is $\log T$ . It then suffices to compare $\overline{\nu}\int_{0}^{T}\frac{dt}{N_{\alpha}(t)}$ to $\frac{N-x}{N}$ , and the rest of the corollary is immediate. ∎

This corollary shows a sharp phase transition towards monopoly in terms of the rewarding schemes. For $\alpha>1$ (increasing reward), there is a threshold for $\overline{\nu}$ , only above which monopoly may occur, and below which the share of stakes increases towards the value on the right side of (3.9). For $\alpha\leq 1$ (constant or decreasing reward), monopoly always occurs. Thus, these results have practical implications in the design of the PoS protocol. For instance, if/when certain participants have large capacities, adopting a suitable increasing reward scheme will counter the effect of concentration.
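The phase transition described above can be summarized in a few lines of code. The sketch below is illustrative: `long_run_outcome` is a hypothetical helper that simply encodes the threshold and the limit (3.9) from Corollary 3.2, and it presumes that condition (3.8) holds.

```python
def long_run_outcome(x, n0, alpha, nu_bar):
    """Classify the long-horizon outcome of full-capacity buying.

    Returns ("no monopoly", limiting_share) when alpha > 1 and nu_bar is at
    or below the threshold (alpha - 1) * (n0 - x) * n0**(-1/alpha), and
    ("monopoly", 1.0) otherwise.
    """
    if alpha > 1:
        threshold = (alpha - 1.0) * (n0 - x) * n0 ** (-1.0 / alpha)
        if nu_bar <= threshold:
            share = nu_bar / (alpha - 1.0) * n0 ** ((1.0 - alpha) / alpha) + x / n0
            return "no monopoly", share
    return "monopoly", 1.0


if __name__ == "__main__":
    # Assumed example: N = 100, x = 50; for alpha = 2 the threshold is 5.
    for alpha, nu_bar in ((2.0, 2.0), (2.0, 6.0), (1.0, 2.0), (0.5, 2.0)):
        print(alpha, nu_bar, long_run_outcome(50.0, 100.0, alpha, nu_bar))
```

In the assumed example with $\alpha=2$, a capacity of $\overline{\nu}=2$ lies below the threshold and the share settles at $0.7$, whereas $\overline{\nu}=6$ leads to monopoly in finite time.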

Now, returning to the proof of Proposition 3.1, we use the standard machinery of dynamic programming and the HJB equation. Consider the following problem, where $V(t,x)$ is the “value-to-go” function, for $0\leq t\leq T$ and $0\leq x\leq N(t)$ :

	$\displaystyle V(t,x):=$	$\displaystyle\max_{\{\nu(s),s\geq t\}}\,\,\int_{t}^{\mathcal{T}}e^{-\beta s}\ \ell(X(s))ds+e^{-\beta\mathcal{T}}h(X(\mathcal{T}))$
		$\displaystyle\mbox{ subject to }X^{\prime}(s)=\nu(s)+\frac{N^{\prime}(s)}{N(s)}X(s),\,X(t)=x,$
		$\displaystyle\qquad\qquad\quad\,0\leq X(s)\leq N(s),$
		$\displaystyle\qquad\qquad\quad\,\|\nu(s)\|\leq\overline{\nu}.$

Clearly, the solution to the above problem concerning $V(t,x)$, for all $t\in[0,T]$ and $x\in[0,N(t)]$, will yield the desired solution to $U(x)$ in (3.6), since $U(x)=V(0,x)$. The following lemma identifies an HJB equation (with terminal and boundary conditions), to which $V(t,x)$ is a solution.

Lemma 3.3.

Let $Q:=\{(t,x):0\leq t<T,\,0<x<N(t)\}$ . Then $V$ is the (unique) viscosity solution to the following HJB equation:

\left\{\begin{array}[]{lcl}\partial_{t}v+e^{-\beta t}\ell(x)+\frac{xN^{\prime}(t)}{N(t)}\partial_{x}v+\sup_{|\nu|\leq\overline{\nu}}\{\nu\,\partial_{x}v\}=0\quad(t,x)\in Q,\\ v(T,x)=e^{-\beta T}h(x),\\ v(t,0)=e^{-\beta t}h(0),\,\,v(t,N(t))=e^{-\beta t}h(N(t)).\end{array}\right.

(3.11)

Proof.

Write the HJB equation as $\partial_{t}v+H(t,x,\partial_{x}v)=0$ , where

H(t,x,p):=e^{-\beta t}\ell(x)+\frac{xN^{\prime}(t)}{N(t)}p+\sup_{|\nu|\leq\overline{\nu}}\{\nu p\}.

The fact that $V$, as defined above, is a viscosity solution follows from a standard dynamic programming argument; see [11, Chapter II, Section 7].

Moreover, from the conditions in Assumption 2.1, we have,

|H(t,x,p)-H(s,y,q)|\leq C(|t-s|+|x-y|+|p-q|+|x-y||p|+|t-s||p|),

(3.12)

for $0\leq s,t\leq T$ and $0\leq x,y\leq N(t)$ , and for some $C>0$ . By [11, Chapter II, Corollary 9.1], the HJB equation in (3.11) has a unique viscosity solution, which then must be none other than $V$ . ∎

What remains is to pin down the term $\sup_{|\nu|\leq\overline{\nu}}\{\nu\,\partial_{x}v\}$ in the HJB equation, i.e., to identify the maximizing $\nu$. Given the intuitive solution that $\nu=\overline{\nu}>0$ (a “conjecture,” so far), the HJB equation in (3.11) is expected to be

\left\{\begin{array}[]{lcl}\partial_{t}v+e^{-\beta t}\ell(x)+\left(\overline{\nu}+\frac{xN^{\prime}(t)}{N(t)}\right)\partial_{x}v=0\,\,\mbox{in }Q,\\ v(T,x)=e^{-\beta T}h(x),\\ v(t,0)=e^{-\beta t}h(0),\,\,v(t,N(t))=e^{-\beta t}h(N(t)),\end{array}\right.

(3.13)

which is a transport equation with variable coefficients. Now we solve the transport equation (3.13) by the method of characteristics. For $0\leq t\leq T$ and $0\leq x\leq N(t)$ , let $\gamma_{t,x}(s)$ be the solution to the following equation:

\gamma^{\prime}_{t,x}(s)=\overline{\nu}+\frac{N^{\prime}(s)}{N(s)}\gamma_{t,x}(s),\quad s>t;\qquad\gamma_{t,x}(t)=x.

(3.14)

A direct computation yields

\gamma_{t,x}(s)=\overline{\nu}N(s)\int_{t}^{s}\frac{du}{N(u)}+\frac{xN(s)}{N(t)},\quad s\geq t.

(3.15)
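The closed form (3.15) can be sanity-checked against a direct numerical integration of (3.14). The sketch below is an illustration under an assumed linear volume $N(u)=n_{0}+u$ (the case $\alpha=1$ of (2.1)), for which $\int_{t}^{s}\frac{du}{N(u)}=\log\frac{n_{0}+s}{n_{0}+t}$; the function names are hypothetical.

```python
import math


def gamma_closed_form(s, t, x, nu_bar, n0):
    """Characteristic (3.15) for N(u) = n0 + u, started from x at time t."""
    ratio = (n0 + s) / (n0 + t)
    return nu_bar * (n0 + s) * math.log(ratio) + x * ratio


def gamma_rk4(s, t, x, nu_bar, n0, steps=1000):
    """Integrate gamma' = nu_bar + gamma / (n0 + u) from u = t to u = s (classical RK4)."""
    rhs = lambda u, g: nu_bar + g / (n0 + u)
    h = (s - t) / steps
    u, g = t, x
    for _ in range(steps):
        k1 = rhs(u, g)
        k2 = rhs(u + h / 2, g + h * k1 / 2)
        k3 = rhs(u + h / 2, g + h * k2 / 2)
        k4 = rhs(u + h, g + h * k3)
        g += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        u += h
    return g


if __name__ == "__main__":
    args = (5.0, 1.0, 20.0, 3.0, 100.0)  # s, t, x, nu_bar, n0 (assumed values)
    print(gamma_closed_form(*args), gamma_rk4(*args))
```

The two values should agree to high accuracy, which is consistent with (3.15) solving the linear equation (3.14).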

Under the regularity conditions in Assumption 2.1, it is standard that (see e.g. [2, 12])

v(t,x)=e^{-\beta\mathcal{T}_{t,x}}h(\gamma_{t,x}(\mathcal{T}_{t,x}))+\int_{t}^{\mathcal{T}_{t,x}}e^{-\beta s}\ell(\gamma_{t,x}(s))ds,

(3.16)

where $\mathcal{T}_{t,x}:=\inf\{s>t:\gamma_{t,x}(s)=N(s)\}\wedge T$ . We will next show that $v(t,x)$ given by (3.16) indeed solves the HJB equation (3.11), which then proves Proposition 3.1.

Proof of Proposition 3.1.

From the expression of $\gamma_{t,x}(s)$ in (3.15), we have

\partial_{x}\gamma_{t,x}(s)=\frac{N(s)}{N(t)}>0\quad\mbox{and}\quad\partial_{x}\mathcal{T}_{t,x}\leq 0.

(3.17)

Note that $\gamma_{t,x}(s)/N(s)$ is increasing in $s$ . There are two cases.

Case 1: If $\gamma_{t,x}(T)/N(T)\leq 1$ , then $\mathcal{T}_{t,x}=T$ and hence, $v(t,x)=e^{-\beta T}h(\gamma_{t,x}(T))+\int_{t}^{T}e^{-\beta s}\ell(\gamma_{t,x}(s))ds$ . By the regularity conditions in Assumption 2.1, we get

\partial_{x}v=e^{-\beta T}\frac{N(T)}{N(t)}h^{\prime}(\gamma_{t,x}(T))+\int_{t}^{T}e^{-\beta s}\frac{N(s)}{N(t)}\ell^{\prime}(\gamma_{t,x}(s))ds\geq 0,

where the non-negativity follows from the fact that $N(t)>0$ and $\ell,h$ are increasing.

Case 2: If $\gamma_{t,x}(T)/N(T)>1$ , then $\mathcal{T}_{t,x}<T$ , and hence $v(t,x)=e^{-\beta\mathcal{T}_{t,x}}h(N(\mathcal{T}_{t,x}))+\int_{t}^{\mathcal{T}_{t,x}}e^{-\beta s}\ell(\gamma_{t,x}(s))ds$ . As a result,

	$\displaystyle\partial_{x}v$	$\displaystyle=-\beta e^{-\beta\mathcal{T}_{t,x}}(\partial_{x}\mathcal{T}_{t,x})h(N(\mathcal{T}_{t,x}))+e^{-\beta\mathcal{T}_{t,x}}(\partial_{x}\mathcal{T}_{t,x})(h\circ N)^{\prime}(\mathcal{T}_{t,x})$
		$\displaystyle\quad\quad+\int_{t}^{\mathcal{T}_{t,x}}e^{-\beta s}\frac{N(s)}{N(t)}\ell^{\prime}(\gamma_{t,x}(s))ds+e^{-\beta\mathcal{T}_{t,x}}(\partial_{x}\mathcal{T}_{t,x})\ell(N(\mathcal{T}_{t,x}))$
		$\displaystyle=\int_{t}^{\mathcal{T}_{t,x}}e^{-\beta s}\frac{N(s)}{N(t)}\ell^{\prime}(\gamma_{t,x}(s))ds$
		$\displaystyle\quad\quad-e^{-\beta\mathcal{T}_{t,x}}\underbrace{(\partial_{x}\mathcal{T}_{t,x})}_{\leq 0\mbox{ \small by (3.17)}}\underbrace{(-\ell\circ N-(h\circ N)^{\prime}+\beta h\circ N)}_{\geq 0\mbox{ \small by (3.8)}}(\mathcal{T}_{t,x})$
		$\displaystyle\geq 0.$

So, in both cases, we have $\partial_{x}v(t,x)\geq 0$ . Thus, $v(t,x)$ defined by (3.16) is a classical solution and hence, a viscosity solution to the HJB equation in (3.11). By Lemma 3.3, we conclude $V(t,x)=v(t,x)$ , and the optimal control is $\nu_{*}(s)=\overline{\nu}$ for $s\geq t$ . Specializing to $t=0$ yields the results in Proposition 3.1 (and $\gamma(t)$ defined by (3.7) is just $\gamma_{0,x}(t)$ ). ∎

3.2. Main theorem and proof

We are now ready to present the main result of this section, the optimal solution to $U(x)$ in (3.5) and hence to $U(x)$ in (3.1).

Theorem 3.4.

Assume that $r\leq\beta$ , and $\widetilde{P}_{\beta}(t)$ in (3.2) satisfies the Lipschitz condition:

|\widetilde{P}_{\beta}(t)-\widetilde{P}_{\beta}(s)|\leq C|t-s|\quad\mbox{for some }C>0.

(3.18)

Then, $U(x)=v(0,x)$ where $v(t,x)$ is the unique viscosity solution to the following HJB equation, where $Q:=\{(t,x):0\leq t<T,\,0<x<N(t)\}$ :

\left\{\begin{array}[]{lcl}\partial_{t}v+e^{-\beta t}\ell(x)+\frac{xN^{\prime}(t)}{N(t)}\partial_{x}v+\sup_{|\nu|\leq\overline{\nu}}\{\nu(\partial_{x}v-\widetilde{P}_{\beta}(t))\}=0\quad\mbox{in }Q,\\ v(T,x)=e^{-\beta T}h(x),\\ v(t,0)=e^{-\beta t}h(0),\,\,v(t,N(t))=e^{-\beta t}h(N(t)).\end{array}\right.

(3.19)

Moreover, the optimal strategy is $b_{*}(t)=0$ and $\nu_{*}(t)=\nu_{*}(t,X_{*}(t))$ for $0\leq t\leq\mathcal{T}_{*}$ , where $\nu_{*}(t,x)$ achieves the supremum in (3.19), and $X_{*}(t)$ solves $X_{*}^{\prime}(t)=\nu_{*}(t,X_{*}(t))+\frac{N^{\prime}(t)}{N(t)}X_{*}(t)$ with $X_{*}(0)=x$ , and $\mathcal{T}_{*}:=\inf\{t>0:X_{*}(t)=0\mbox{ or }N(t)\}\wedge T$ .

Proof.

Similar to the dynamic programming/HJB approach that proves Lemma 3.3 and Proposition 3.1 above, here we consider

	$\displaystyle V_{2}(t,x):=$	$\displaystyle\max_{\nu(s)}\,\,\int_{t}^{\mathcal{T}}\left(-\widetilde{P}_{\beta}(s)\nu(s)+e^{-\beta s}\ell(X(s))\right)ds+e^{-\beta\mathcal{T}}h(X(\mathcal{T}))$
		$\displaystyle\mbox{ subject to }X^{\prime}(s)=\nu(s)+\frac{N^{\prime}(s)}{N(s)}X(s),\,X(t)=x,$
		$\displaystyle\qquad\qquad\quad\,0\leq X(s)\leq N(s),$
		$\displaystyle\qquad\qquad\quad\,\|\nu(s)\|\leq\overline{\nu},$

so that $U(x)=V_{2}(0,x)$ . By the same dynamic programming argument as above, $V_{2}$ solves in the viscosity sense the HJB equation in (3.19), which can be expressed as $\partial_{t}v+H(t,x,\partial_{x}v)=0$ , with

H(t,x,p):=e^{-\beta t}\ell(x)+\frac{xN^{\prime}(t)}{N(t)}p+\sup_{|\nu|\leq\overline{\nu}}\{\nu(p-\widetilde{P}_{\beta}(t))\}.

It is readily checked that under Assumption 2.1 and the Lipschitz condition in (3.18), the inequality in (3.12) holds. Thus, $V_{2}$ as identified above is the unique viscosity solution to the HJB equation in (3.19). The rest of the theorem is straightforward. ∎

Comparing the HJB equations in (3.11) and in (3.19), we see the nonlinear term changes from $\sup_{|\nu|\leq\overline{\nu}}\{\nu\partial_{x}v\}$ in the stake-hoarding problem, to $\sup_{|\nu|\leq\overline{\nu}}\{\nu(\partial_{x}v-\widetilde{P}_{\beta}(t))\}$ in the stake-trading problem, the latter being the general consumption-investment problem. The more general HJB equation in (3.19) does not have a closed-form solution, and neither does the optimal trading strategy $\nu_{*}(t)$ . This calls for numerical methods; see e.g. [19, 23].

4. Linear and Convex Utilities

4.1. Linear utility

Consider the special case of linear utility, $\ell(x)=\ell x$ and $h(x)=hx$ , for some given (positive) constants $\ell$ and $h$ . In this case we can derive a closed-form solution to the HJB equation in (3.19), and then derive the optimal strategy $\nu_{*}(t)$ (in terms of $\widetilde{P}_{\beta}(t)$ ).

To start with, the HJB equation in (3.19) now specializes to the following, with $Q:=\{(t,x):0\leq t<T,\,0<x<N(t)\}$ (as before, refer to Lemma 3.3):

\left\{\begin{array}[]{lcl}\partial_{t}v+\ell e^{-\beta t}x+\frac{xN^{\prime}(t)}{N(t)}\partial_{x}v+\sup_{|\nu|\leq\overline{\nu}}\{\nu(\partial_{x}v-\widetilde{P}_{\beta}(t))\}=0\quad(t,x)\in Q,\\ v(T,x)=he^{-\beta T}x,\\ v(t,0)=0,\,v(t,N(t))=he^{-\beta t}N(t).\end{array}\right.

(4.1)

For the nonlinear term $\sup_{|\nu|\leq\overline{\nu}}\{\nu(\partial_{x}v-\widetilde{P}_{\beta}(t))\}$ , we have $\nu_{*}(t,x)=\overline{\nu}$ if $\partial_{x}v(t,x)\geq\widetilde{P}_{\beta}(t)$ , and $\nu_{*}(t,x)=-\overline{\nu}$ if $\partial_{x}v(t,x)<\widetilde{P}_{\beta}(t)$ .

Next, presuming that $\partial_{x}v\geq\widetilde{P}_{\beta}(t)$ , and ignoring the boundary conditions, the HJB equation in (4.1) becomes

\partial_{t}v+\ell e^{-\beta t}x-\overline{\nu}\widetilde{P}_{\beta}(t)+\left(\overline{\nu}+\frac{xN^{\prime}(t)}{N(t)}\right)\partial_{x}v=0,\quad v(T,x)=he^{-\beta T}x,

which has the (classical) solution

v^{+}(t,x):=he^{-\beta T}\gamma^{+}_{t,x}(T)+\int_{t}^{T}\left[\ell e^{-\beta s}\gamma^{+}_{t,x}(s)-\overline{\nu}\widetilde{P}_{\beta}(s)\right]ds,

(4.2)

where

\gamma^{+}_{t,x}(s):=\overline{\nu}N(s)\int_{t}^{s}\frac{du}{N(u)}+\frac{xN(s)}{N(t)},\quad s\in[t,T].

(4.3)

Similarly, presuming that $\partial_{x}v<\widetilde{P}_{\beta}(t)$ and neglecting the boundary conditions turns the HJB equation in (4.1) into the following form:

\partial_{t}v+\ell e^{-\beta t}x+\overline{\nu}\widetilde{P}_{\beta}(t)+\left(-\overline{\nu}+\frac{xN^{\prime}(t)}{N(t)}\right)\partial_{x}v=0,\quad v(T,x)=he^{-\beta T}x,

which has the solution

v^{-}(t,x):=he^{-\beta T}\gamma_{t,x}^{-}(T)+\int_{t}^{T}\left[\ell e^{-\beta s}\gamma^{-}_{t,x}(s)+\overline{\nu}\widetilde{P}_{\beta}(s)\right]ds,

(4.4)

where

\gamma^{-}_{t,x}(s):=-\overline{\nu}N(s)\int_{t}^{s}\frac{du}{N(u)}+\frac{xN(s)}{N(t)},\quad s\in[t,T].

(4.5)
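As a quick sanity check, the paths $\gamma^{\pm}_{t,x}(s)$ in (4.3) and (4.5) should coincide with the state trajectory driven by $X^{\prime}(s)=\pm\overline{\nu}+\frac{N^{\prime}(s)}{N(s)}X(s)$ , $X(t)=x$ . The Python sketch below compares the closed form against a Runge-Kutta integration. It assumes the volume $N(t)=(N^{1/\alpha}+t)^{\alpha}$ (the form suggested by the log-derivative used in the proof of Proposition 4.2); all numerical values and function names are hypothetical choices for illustration only.

```python
import math

# Hypothetical parameters (not from the paper): initial volume and growth exponent.
N0, ALPHA = 100.0, 0.5
C = N0 ** (1.0 / ALPHA)  # chosen so that N(0) = N0

def N(t):
    """Assumed total stake volume N(t) = (N0^(1/alpha) + t)^alpha."""
    return (C + t) ** ALPHA

def inv_N_integral(a, b):
    """Closed form of the integral of du / N(u) over [a, b] (valid for alpha != 1)."""
    p = 1.0 - ALPHA
    return ((C + b) ** p - (C + a) ** p) / p

def gamma(t, x, s, nu):
    """Closed-form path: nu = +nu_bar gives (4.3), nu = -nu_bar gives (4.5)."""
    return nu * N(s) * inv_N_integral(t, s) + x * N(s) / N(t)

def rk4_path(t, x, s, nu, steps=1000):
    """Integrate X'(u) = nu + N'(u)/N(u) * X(u) from u = t to u = s (classical RK4)."""
    f = lambda u, y: nu + ALPHA / (C + u) * y
    h = (s - t) / steps
    u, y = t, x
    for _ in range(steps):
        k1 = f(u, y)
        k2 = f(u + h / 2, y + h * k1 / 2)
        k3 = f(u + h / 2, y + h * k2 / 2)
        k4 = f(u + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        u += h
    return y

if __name__ == "__main__":
    for nu in (0.3, -0.3):
        print(nu, gamma(0.0, 10.0, 5.0, nu), rk4_path(0.0, 10.0, 5.0, nu))
```

The two columns printed for each sign of $\nu$ should agree up to the integration error of the ODE solver.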

The key observation is that

\partial_{x}v^{+}(t,x)=\partial_{x}v^{-}(t,x)=\underbrace{\frac{1}{N(t)}\left(he^{-\beta T}N(T)+\ell\int_{t}^{T}e^{-\beta s}N(s)ds\right)}_{:=\Psi(t)};

(4.6)

and $\Psi(t)$ , notably independent of $x$ , is decreasing in $t\in[0,T]$ :

\displaystyle\Psi(0)=he^{-\beta T}\frac{N(T)}{N}+\frac{\ell}{N}\int_{0}^{T}e^{-\beta t}N(t)dt\;\downarrow(\geq)\;\Psi(t)\;\downarrow(\geq)\;\Psi(T)=he^{-\beta T}.

(4.7)

This suggests that $\nu_{*}(t)=\overline{\nu}$ (buy all the time) if $\sup_{[0,T]}\widetilde{P}_{\beta}(t)\leq\Psi(T)$ ; and $\nu_{*}(t)=-\overline{\nu}$ (sell all the time) if $\inf_{[0,T]}\widetilde{P}_{\beta}(t)\geq\Psi(0)$ . Various other scenarios are also possible, such as first buy then sell, or first sell then buy, and so forth.
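To make the monotonicity in (4.7) concrete, the following Python sketch evaluates $\Psi(t)$ from (4.6) by numerical quadrature. It is only an illustration under assumed inputs: the volume $N(t)=(N^{1/\alpha}+t)^{\alpha}$ and all parameter values are hypothetical, and Simpson's rule is our choice of quadrature.

```python
import math

# Hypothetical parameters for illustration only.
N0, ALPHA = 100.0, 0.5      # initial volume and growth exponent (assumed form of N)
BETA, T = 0.05, 10.0        # discount rate and horizon
ELL, H = 1.0, 2.0           # linear utility coefficients: ell(x) = ELL*x, h(x) = H*x
C = N0 ** (1.0 / ALPHA)

def N(t):
    """Assumed total stake volume N(t) = (N0^(1/alpha) + t)^alpha."""
    return (C + t) ** ALPHA

def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def psi(t):
    """Psi(t) from (4.6): the x-derivative shared by v^+ and v^-."""
    integral = simpson(lambda s: math.exp(-BETA * s) * N(s), t, T)
    return (H * math.exp(-BETA * T) * N(T) + ELL * integral) / N(t)

if __name__ == "__main__":
    for i in range(0, 11, 2):
        t = T * i / 10
        print(f"t = {t:5.2f}  Psi(t) = {psi(t):.6f}")
```

On such a grid the printed values should decrease from $\Psi(0)$ to $\Psi(T)=he^{-\beta T}$ , in line with (4.7).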

The following proposition classifies all possible optimal strategies corresponding to $\widetilde{P}_{\beta}(t)$ as specified above, which we will comment on later.

Proposition 4.1.

Let $\ell(x)=\ell x$ and $h(x)=hx$ with $\ell,h>0$ , and $N(t)$ satisfy Assumption 2.1 (i). Assume that $\widetilde{P}_{\beta}(t)$ satisfies the Lipschitz condition in (3.18), and that $\overline{\nu}$ satisfies the following:

\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\leq\frac{x}{N}\wedge\frac{N-x}{N}.

(4.8)

Then, the following results hold:

(i)

Suppose $\widetilde{P}_{\beta}(t)$ stays constant, i.e., for all $t\in[0,T]$ , $\widetilde{P}_{\beta}(t)=\widetilde{P}_{\beta}(0)=P(0)$ .
(a)

If $P(0)\geq\Psi(0)$ , then $\nu_{*}(t)=-\overline{\nu}$ for all $0\leq t\leq T$ . That is, the participant sells at full capacity at all times.
(b)

If $P(0)\leq\Psi(T)$ , then $\nu_{*}(t)=\overline{\nu}$ for all $0\leq t\leq T$ . That is, the participant purchases at full capacity at all times.
(c)

If $\Psi(T)<P(0)<\Psi(0)$ , then

$\nu_{*}(t)=\left\{\begin{array}[]{lcl}\overline{\nu}&\mbox{for }t\leq t_{0},\\ -\overline{\nu}&\mbox{for }t>t_{0},\end{array}\right.$

where $t_{0}$ is the unique point in $[0,T]$ such that $P(0)=\Psi(t_{0})$ with $\Psi(t)$ defined in (4.6). That is, the participant first buys and after some time sells, both at full capacity.
(ii)

Suppose that $\widetilde{P}_{\beta}(t)$ is increasing in $t\in[0,T]$ .
(a)

If $P(0)\geq\Psi(0)$ , then $\nu_{*}(t)=-\overline{\nu}$ for all $0\leq t\leq T$ . That is, the participant sells all the time at full capacity.
(b)

If $\widetilde{P}_{\beta}(T)\leq\Psi(T)$ , then $\nu_{*}(t)=\overline{\nu}$ for all $0\leq t\leq T$ . That is, the participant purchases all the time at full capacity.
(c)

If $P(0)<\Psi(0)$ and $\widetilde{P}_{\beta}(T)>\Psi(T)$ , then

$\nu_{*}(t)=\left\{\begin{array}[]{lcl}\overline{\nu}&\mbox{for }t\leq t_{0},\\ -\overline{\nu}&\mbox{for }t>t_{0},\end{array}\right.$

where $t_{0}$ is the unique point of intersection of $\widetilde{P}_{\beta}(t)$ and $\Psi(t)$ on $[0,T]$ . That is, the participant first buys and after some time sells, both at full capacity.
(iii)

Suppose that $\widetilde{P}_{\beta}(t)$ is decreasing in $t\in[0,T]$ .
(a)

If $P(0)\geq\Psi(0)$ , then the participant first sells, and may then buy, etc., always (buy or sell) at full capacity, according to the crossings of $\widetilde{P}_{\beta}(t)$ and $\Psi(t)$ in $[0,T]$ .
(b)

If $P(0)<\Psi(0)$ , then the participant first buys, and may then sell, etc., always (buy or sell) at full capacity, according to the crossings of $\widetilde{P}_{\beta}(t)$ and $\Psi(t)$ in $[0,T]$ .

Proof.

Recall that $X_{*}(t)$ is the state process (number of stakes) corresponding to the optimal strategy $\nu_{*}(t)$ , which, as stipulated in the rest of the proposition, will be equal to either $\overline{\nu}$ or $-\overline{\nu}$ . The condition in (4.8) then ensures that $0\leq X_{*}(t)\leq N(t)$ for all $t\in[0,T]$ , so $\mathcal{T}_{*}=T$ (i.e., there will be no forced early exit).

Thus, it suffices to find the optimal strategy $\nu_{*}(t)$ from

\sup_{|\nu|\leq\overline{\nu}}\{\nu[\partial_{x}v-\widetilde{P}_{\beta}(t)]\}=\sup_{|\nu|\leq\overline{\nu}}\{\nu[\Psi(t)-\widetilde{P}_{\beta}(t)]\}.

(i) and (ii). Since $\Psi(t)$ is decreasing and $\widetilde{P}_{\beta}(t)$ is either constant or increasing, $\Psi(t)-\widetilde{P}_{\beta}(t)$ is decreasing. Hence, we have the following cases (for both (i) and (ii)).

(a) If $\widetilde{P}_{\beta}(0)=P(0)\geq\Psi(0)$ , then $\widetilde{P}_{\beta}(t)\geq\Psi(t)$ for all $t\in[0,T]$ ; hence, $\nu_{*}(t)=-\overline{\nu}$ , and $U(x)=v^{-}(0,x)$ .

(b) Similarly, if $\widetilde{P}_{\beta}(T)\leq\Psi(T)$ (which reads $P(0)\leq\Psi(T)$ in part (i)), then $\widetilde{P}_{\beta}(t)\leq\Psi(t)$ for all $t\in[0,T]$ ; hence, $\nu_{*}(t)=\overline{\nu}$ , and $U(x)=v^{+}(0,x)$ .

(c) Otherwise, there will be a unique point for $\Psi(t)-\widetilde{P}_{\beta}(t)$ (which is decreasing in $t$ ) to cross $0$ from above, and let $t_{0}\in[0,T]$ denote the crossing point. This implies that $\nu_{*}(t)=\overline{\nu}$ for $t\leq t_{0}$ , and $\nu_{*}(t)=-\overline{\nu}$ for $t>t_{0}$ ; and

U(x)=v^{+}(0,x)-v^{+}(t_{0},\gamma^{+}_{0,x}(t_{0}))+v^{-}(t_{0},\gamma^{+}_{0,x}(t_{0})).

Part (iii) is similarly argued; the only complication is that $\Psi(t)-\widetilde{P}_{\beta}(t)$ is now possibly non-monotone, and hence, there may be multiple points at which it crosses $0$ . ∎
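The comparison of $\Psi(t)$ with $\widetilde{P}_{\beta}(t)$ in parts (i) and (ii) can be sketched numerically as follows. This Python example is an illustration, not part of the proof: it assumes the volume $N(t)=(N^{1/\alpha}+t)^{\alpha}$ , uses hypothetical parameter values, and locates the switching time $t_{0}$ by bisection (our choice), which relies on $\Psi(t)-\widetilde{P}_{\beta}(t)$ being decreasing.

```python
import math

# Hypothetical parameters for illustration only.
N0, ALPHA, BETA, T = 100.0, 0.5, 0.05, 10.0
ELL, H = 1.0, 2.0
C = N0 ** (1.0 / ALPHA)

def N(t):
    """Assumed total stake volume N(t) = (N0^(1/alpha) + t)^alpha."""
    return (C + t) ** ALPHA

def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def psi(t):
    """Psi(t) from (4.6) for linear utility."""
    integral = simpson(lambda s: math.exp(-BETA * s) * N(s), t, T)
    return (H * math.exp(-BETA * T) * N(T) + ELL * integral) / N(t)

def switch_time(p_tilde, tol=1e-10):
    """Return t0 such that the rule is 'buy on [0, t0], sell on (t0, T]'.

    Assumes psi - p_tilde is decreasing, as in parts (i) and (ii).
    t0 = 0 means sell throughout; t0 = T means buy throughout.
    """
    diff = lambda t: psi(t) - p_tilde(t)
    if diff(0.0) <= 0.0:      # case (a): P(0) >= Psi(0)
        return 0.0
    if diff(T) >= 0.0:        # case (b): P_tilde(T) <= Psi(T)
        return T
    lo, hi = 0.0, T           # case (c): a single crossing in (0, T)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nu_star(t, t0, nu_bar):
    """Bang-bang control: full-capacity buy up to t0, full-capacity sell after."""
    return nu_bar if t <= t0 else -nu_bar

if __name__ == "__main__":
    p_mid = 0.5 * (psi(0.0) + psi(T))   # a constant price strictly between Psi(T) and Psi(0)
    t0 = switch_time(lambda t: p_mid)
    print("switching time t0 =", t0)
```

For part (iii), where several crossings may occur, one would instead scan the sign of $\Psi(t)-\widetilde{P}_{\beta}(t)$ on a grid and refine each sign change separately.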

Several remarks are in order. First, note that the condition in (4.8) guarantees that the constraint (C2’) is not activated prior to $T$ ; that is, it excludes the possibility of monopoly/dictatorship that would trigger a forced early exit. This condition may well be removed, but then we would expect another condition similar to the one in (3.8) to guarantee the optimality of a strategy when an early exit occurs.

Second, $\widetilde{P}_{\beta}(t)=\mathbb{E}\big{[}e^{-\beta t}P(t)\big{]}$ combines $\beta$ , which measures the participant’s sensitivity towards risk, with the stake price $P(t)$ . Thus, the monotone properties of $\widetilde{P}_{\beta}(t)$ , which classify the three parts (i)-(iii) in Proposition 4.1, naturally connect to martingale pricing: $\widetilde{P}_{\beta}(t)$ being a constant in (i) makes the process $e^{-\beta t}P(t)$ a martingale; whereas $\widetilde{P}_{\beta}(t)$ increasing or decreasing, respectively in (ii) and (iii), makes $e^{-\beta t}P(t)$ a sub-martingale or a super-martingale.

On the other hand, the function $\Psi(t)=\partial_{x}v^{+}(t,x)=\partial_{x}v^{-}(t,x)$ represents the rate of return of the participant’s utility (from holding of stakes, $x$ ); and interestingly, in the linear utility case, this return rate is independent of $x$ while decreasing in $t$ . Thus, the trading strategy is completely determined by comparing this return rate $\Psi(t)$ with the participant’s risk-adjusted stake price (or, valuation) $\widetilde{P}_{\beta}(t)$ : if $\Psi(t)\geq({\rm resp.}<)\widetilde{P}_{\beta}(t)$ , then the participant will buy (resp. sell) stakes.

Specifically, following (i) and (ii) of Proposition 4.1, for a constant or an increasing $\widetilde{P}_{\beta}(t)$ (corresponding to a risk-neutral or risk-seeking participant), there are only three possible optimal strategies: buy all the time, sell all the time, or first buy then sell. (The first-buy-then-sell strategy echoes the general investment practice that an early investment pays off at a later date.) See Figure 2 for an illustration.

4.2. A special case

In part (iii) of Proposition 4.1, when $\widetilde{P}_{\beta}(t)$ is decreasing in $t$ , like $\Psi(t)$ , the multiple crossings between the two decreasing functions can be further pinned down when there is more model structure. Consider, for instance, the case where $P(t)$ follows a geometric Brownian motion (GBM):

\frac{dP(t)}{P(t)}=\mu dt+\sigma dB_{t},\quad{\rm or}\quad P(t)=P(0)e^{(\mu-\sigma^{2}/2)t+\sigma B_{t}};\quad t\in[0,T],

(4.9)

where $\{B_{t}\}$ denotes the standard Brownian motion; and $\mu>0$ and $\sigma>0$ are the two parameters of the GBM model, representing the rate of return and the volatility of $\{P(t)\}$ . From the second equation in (4.9), we have $\mathbb{E}P(t)=P(0)e^{\mu t}$ ; hence, $\widetilde{P}_{\beta}(t)=P(0)e^{-(\beta-\mu)t}$ . Then, a decreasing $\widetilde{P}_{\beta}(t)$ corresponds to $\beta>\mu$ . From (4.6), we can derive

\Psi^{\prime}(t)=-\frac{N^{\prime}(t)}{N(t)}\Psi(t)-\ell e^{-\beta t},

and hence,

\left(\Psi(t)-\widetilde{P}_{\beta}(t)\right)^{\prime}=-\frac{N^{\prime}(t)}{N(t)}\Psi(t)-\ell e^{-\beta t}+(\beta-\mu)P(0)e^{-(\beta-\mu)t}.

(4.10)
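The identity (4.10) can be checked numerically by comparing a finite-difference derivative of $\Psi(t)-\widetilde{P}_{\beta}(t)$ with the right-hand side. The Python sketch below does this under assumptions of ours: the volume $N(t)=(N^{1/\alpha}+t)^{\alpha}$ , hypothetical parameter values with $\beta>\mu$ , Simpson quadrature for the integral in (4.6), and a central difference with a hypothetical step size.

```python
import math

# Hypothetical parameters for illustration only (beta > mu).
N0, ALPHA, BETA, MU, T = 100.0, 0.5, 0.05, 0.02, 10.0
ELL, H, P0 = 1.0, 2.0, 3.0
C = N0 ** (1.0 / ALPHA)

def N(t):
    """Assumed total stake volume N(t) = (N0^(1/alpha) + t)^alpha."""
    return (C + t) ** ALPHA

def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def psi(t):
    """Psi(t) from (4.6) for linear utility."""
    integral = simpson(lambda s: math.exp(-BETA * s) * N(s), t, T)
    return (H * math.exp(-BETA * T) * N(T) + ELL * integral) / N(t)

def p_tilde(t):
    """Risk-adjusted price under GBM: P(0) * exp(-(beta - mu) t)."""
    return P0 * math.exp(-(BETA - MU) * t)

def lhs(t, eps=1e-4):
    """Central-difference approximation of (Psi - P_tilde)'(t)."""
    d = lambda u: psi(u) - p_tilde(u)
    return (d(t + eps) - d(t - eps)) / (2 * eps)

def rhs(t):
    """Right-hand side of (4.10), using N'(t)/N(t) = alpha / (N0^(1/alpha) + t)."""
    dlog_n = ALPHA / (C + t)
    return -dlog_n * psi(t) - ELL * math.exp(-BETA * t) + (BETA - MU) * p_tilde(t)

if __name__ == "__main__":
    for t in (1.0, 5.0, 9.0):
        print(t, lhs(t), rhs(t))
```

The sign of either column on a grid of $t$ then indicates whether $\Psi(t)-\widetilde{P}_{\beta}(t)$ is locally increasing or decreasing for the chosen parameters.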

Let $\Psi_{\alpha}(t)$ denote $\Psi(t)$ for $N(t)=N_{\alpha}(t)$ defined by (2.1). The following proposition gives the conditions under which $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is monotone in the regime $N\to\infty$ , and optimal strategies are derived accordingly.

Proposition 4.2.

Suppose the assumptions in Proposition 4.1 hold, with $N(t)=N_{\alpha}(t)$ and $\{P(t)\}$ specified by (4.9) with $\beta>\mu$ . As $N\to\infty$ , we have the following results:

•

If for some $\varepsilon>0$ ,

P(0)>\frac{1}{\beta-\mu}\left(\frac{\alpha he^{-\mu T}(N^{\frac{1}{\alpha}}+T)^{\alpha}}{N^{1+\frac{1}{\alpha}}}+\frac{\alpha\ell\beta^{-1}}{N^{\frac{1}{\alpha}}}+\ell\right)+\frac{\varepsilon}{N^{\frac{1}{\alpha}}},

(4.11)

then $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is increasing on $[0,T]$ .

•

If for some $\varepsilon>0$ ,

P(0)<\frac{1}{\beta-\mu}\left(\frac{\alpha he^{-\beta T}}{N^{\frac{1}{\alpha}}+T}+\ell e^{-\mu T}\right)-\frac{\varepsilon}{N^{\frac{1}{\alpha}}},

(4.12)

then $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is decreasing on $[0,T]$ .

Consequently, we have:

(a)

If $P(0)>e^{(\beta-\mu)T}\Psi_{\alpha}(T)$ and (4.11) holds, or $P(0)>\Psi_{\alpha}(0)$ and (4.12) holds, then $\nu_{*}(t)=-\overline{\nu}$ for all $t$ . That is, the participant sells all the time at full capacity.
(b)

If $\Psi_{\alpha}(0)\leq P(0)<e^{(\beta-\mu)T}\Psi_{\alpha}(T)$ and (4.11) holds, then $\nu_{*}(t)=-\overline{\nu}$ for $t\leq t_{0}$ and $\nu_{*}(t)=\overline{\nu}$ for $t>t_{0}$ , where $t_{0}$ is the unique point of intersection of $\widetilde{P}_{\beta}(t)$ and $\Psi_{\alpha}(t)$ on $[0,T]$ . That is, the participant first sells (before $t_{0}$ ) and then buys (after $t_{0}$ ), both at full capacity.
(c)

If $e^{(\beta-\mu)T}\Psi_{\alpha}(T)\leq P(0)<\Psi_{\alpha}(0)$ and (4.12) holds, then $\nu_{*}(t)=\overline{\nu}$ for $t\leq t_{0}$ and $\nu_{*}(t)=-\overline{\nu}$ for $t>t_{0}$ , where $t_{0}$ is the unique point of intersection of $\widetilde{P}_{\beta}(t)$ and $\Psi_{\alpha}(t)$ on $[0,T]$ . That is, the participant first buys (before $t_{0}$ ) and then sells (after $t_{0}$ ), both at full capacity.
(d)

If $P(0)<e^{(\beta-\mu)T}\Psi_{\alpha}(T)$ and (4.12) holds, or $P(0)<\Psi_{\alpha}(0)$ and (4.11) holds, then $\nu_{*}(t)=\overline{\nu}$ for all $t$ . That is, the participant buys all the time at full capacity.

Proof.

Note that $\frac{N^{\prime}_{\alpha}(t)}{N_{\alpha}(t)}=\alpha(N^{\frac{1}{\alpha}}+t)^{-1}$ , and

\int_{t}^{T}e^{-\beta s}N_{\alpha}(s)ds=e^{\beta N^{\frac{1}{\alpha}}}\beta^{-\alpha-1}\left(\Gamma(\alpha+1,\beta(N^{\frac{1}{\alpha}}+t))-\Gamma(\alpha+1,\beta(N^{\frac{1}{\alpha}}+T))\right),

where $\Gamma(a,x):=\int_{x}^{\infty}t^{a-1}e^{-t}dt$ is the incomplete Gamma function. As $N\to\infty$ , we have

\int_{t}^{T}e^{-\beta s}N_{\alpha}(s)ds=\beta^{-1}\left(e^{-\beta t}N_{\alpha}(t)-e^{-\beta T}N_{\alpha}(T)\right)+o(N),

which together with (4.6) and (4.10) implies that

	$\displaystyle\left(\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)\right)^{\prime}=-\frac{\alpha}{N^{\frac{1}{\alpha}}+t}$	$\displaystyle\left[\frac{he^{-\beta T}N_{\alpha}(T)}{N_{\alpha}(t)}+\ell\beta^{-1}\left(e^{-\beta t}-e^{-\beta T}\frac{N_{\alpha}(T)}{N_{\alpha}(t)}\right)+o(1)\right]$		(4.13)
		$\displaystyle-\ell e^{-\beta t}+(\beta-\mu)P(0)e^{-(\beta-\mu)t}.$

Multiplying the RHS of (4.13) by $e^{(\beta-\mu)t}$ , we get

	$\displaystyle-\frac{\alpha}{N^{\frac{1}{\alpha}}+t}$	$\displaystyle\left[\frac{he^{-\beta(T-t)-\mu t}N_{\alpha}(T)}{N_{\alpha}(t)}+\ell\beta^{-1}\left(e^{-\mu t}-e^{-\beta(T-t)-\mu t}\frac{N_{\alpha}(T)}{N_{\alpha}(t)}\right)+o(1)\right]$
		$\displaystyle-\ell e^{-\mu t}+(\beta-\mu)P(0).$

The sum of all the terms above is lower bounded by

-\left(\frac{\alpha he^{-\mu T}N_{\alpha}(T)}{N^{1+\frac{1}{\alpha}}}+\alpha\ell\beta^{-1}N^{-\frac{1}{\alpha}}+\ell\right)+(\beta-\mu)P(0)\stackrel{(4.11)}{>}0,

which implies that $\inf_{[0,T]}\left(\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)\right)^{\prime}>0$ , and hence, $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is increasing.

Moreover, the same sum is upper bounded by

-\left(\frac{\alpha he^{-\beta T}}{N^{\frac{1}{\alpha}}+T}+\ell e^{-\mu T}\right)+(\beta-\mu)P(0)\stackrel{(4.12)}{<}0,

which implies that $\sup_{[0,T]}\left(\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)\right)^{\prime}<0$ , and hence, $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is decreasing.

(a) If $P(0)>e^{(\beta-\mu)T}\Psi_{\alpha}(T)$ and (4.11) holds, then $\Psi_{\alpha}(T)<\widetilde{P}_{\beta}(T)$ and $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is increasing. If $P(0)>\Psi_{\alpha}(0)$ and (4.12) holds, then $\Psi_{\alpha}(0)<\widetilde{P}_{\beta}(0)$ and $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)$ is decreasing. In both cases, we have $\Psi_{\alpha}(t)-\widetilde{P}_{\beta}(t)<0$ for all $t$ .

(b) (c) (d) follow the same argument as (a). ∎
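The incomplete-Gamma representation of $\int_{t}^{T}e^{-\beta s}N_{\alpha}(s)ds$ used in the proof can be verified numerically. The Python sketch below does so for an integer exponent, where $\Gamma(n,x)=(n-1)!\,e^{-x}\sum_{k=0}^{n-1}x^{k}/k!$ gives a closed form that needs only the standard library. The choice $\alpha=2$ , the other parameter values, and the use of Simpson's rule are our assumptions for illustration; $N_{\alpha}(t)=(N^{1/\alpha}+t)^{\alpha}$ is the form consistent with the log-derivative stated at the start of the proof.

```python
import math

# Hypothetical parameters for illustration only; ALPHA is an integer here so that
# the upper incomplete Gamma function has an elementary closed form.
N0, ALPHA, BETA = 100.0, 2, 0.1
C = N0 ** (1.0 / ALPHA)  # N^(1/alpha)

def N_alpha(t):
    """Assumed volume N_alpha(t) = (N^(1/alpha) + t)^alpha."""
    return (C + t) ** ALPHA

def upper_gamma_int(n, x):
    """Upper incomplete Gamma(n, x) for a positive integer n."""
    partial = sum(x ** k / math.factorial(k) for k in range(n))
    return math.factorial(n - 1) * math.exp(-x) * partial

def closed_form(t, T):
    """Right-hand side of the incomplete-Gamma identity in the proof."""
    scale = math.exp(BETA * C) * BETA ** (-ALPHA - 1)
    return scale * (upper_gamma_int(ALPHA + 1, BETA * (C + t))
                    - upper_gamma_int(ALPHA + 1, BETA * (C + T)))

def simpson(f, a, b, n=400):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def quadrature(t, T):
    """Direct numerical evaluation of the integral of exp(-beta s) N_alpha(s) over [t, T]."""
    return simpson(lambda s: math.exp(-BETA * s) * N_alpha(s), t, T)

if __name__ == "__main__":
    print(closed_form(1.0, 6.0), quadrature(1.0, 6.0))
```

For non-integer $\alpha$ one would need a numerical routine for the incomplete Gamma function; the integer case is used here only to keep the check elementary.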

See Figure 3 below for an illustration of the results in the above proposition. Also note that the connection to the participant’s risk sensitivity as remarked at the end of §4.1 can also be made more explicit when the price process $P(t)$ follows the GBM model in (4.9), for which we have $\widetilde{P}_{\beta}(t)=P(0)e^{-(\beta-\mu)t}$ . Then, the three cases in Proposition 4.1 correspond to $\beta=\mu$ (martingale), $\beta<\mu$ (sub-martingale), and $\beta>\mu$ (super-martingale). According to the three ranges of $\beta$ , they can be viewed as representing the participant as risk-neutral, risk-seeking and risk-averse.

4.3. Convex utility

It is possible to extend the above results to more general, non-linear utility functions $\ell(\cdot)$ and $h(\cdot)$ , by following the same approach as above that leads to $v^{+}(t,x)$ and $v^{-}(t,x)$ in (4.2) and (4.4).

Specifically, considering the two cases of $\partial_{x}v\geq\widetilde{P}_{\beta}(t)$ , and $\partial_{x}v<\widetilde{P}_{\beta}(t)$ , we can derive

	$\displaystyle v^{+}(t,x):=e^{-\beta T}h\big{(}\gamma^{+}_{t,x}(T)\big{)}+\int_{t}^{T}\left[e^{-\beta s}\ell\big{(}\gamma^{+}_{t,x}(s)\big{)}-\overline{\nu}\widetilde{P}_{\beta}(s)\right]ds,$		(4.14)
	$\displaystyle v^{-}(t,x):=e^{-\beta T}h\big{(}\gamma_{t,x}^{-}(T)\big{)}+\int_{t}^{T}\left[e^{-\beta s}\ell\big{(}\gamma^{-}_{t,x}(s)\big{)}+\overline{\nu}\widetilde{P}_{\beta}(s)\right]ds;$		(4.15)

whereas $\gamma_{t,x}^{+}$ and $\gamma_{t,x}^{-}$ remain the same as in (4.3) and (4.5).

The $\Psi$ function in (4.6) now splits into two functions: for $(t,x)\in Q:=\{(t,x):0\leq t<T,\,0<x<N(t)\}$ , we have

\displaystyle\partial_{x}v^{+}(t,x)=\underbrace{\frac{1}{N(t)}\left(e^{-\beta T}N(T)h^{\prime}\big{(}\gamma_{t,x}^{+}(T)\big{)}+\int_{t}^{T}e^{-\beta s}N(s)\ell^{\prime}\big{(}\gamma^{+}_{t,x}(s)\big{)}ds\right)}_{:=\Psi^{+}(t,x)},

(4.16)

and

\displaystyle\partial_{x}v^{-}(t,x)=\underbrace{\frac{1}{N(t)}\left(e^{-\beta T}N(T)h^{\prime}\big{(}\gamma_{t,x}^{-}(T)\big{)}+\int_{t}^{T}e^{-\beta s}N(s)\ell^{\prime}\big{(}\gamma^{-}_{t,x}(s)\big{)}ds\right)}_{:=\Psi^{-}(t,x)}.

(4.17)

Note that both $\Psi^{+}$ and $\Psi^{-}$ depend on $x$ (as well as on $t$ ), via $\gamma^{+}_{t,x}$ and $\gamma^{-}_{t,x}$ . This dependence makes it necessary to take a closer look at $\gamma^{+}_{t,x}$ and $\gamma^{-}_{t,x}$ , since the $x=x(t)$ involved in both depends on the control $\nu$ before (and up to) $t$ . We have the following cases: for $s\geq t$ ,

	$\displaystyle{\rm if}\;x=\gamma^{+}_{0,x}(t),\;{\rm then}$	$\displaystyle\gamma^{+}_{+}(s):=\gamma^{+}_{t,x}(s)=\left(\overline{\nu}\int_{t}^{s}\frac{du}{N(u)}+\overline{\nu}\int_{0}^{t}\frac{du}{N(u)}+\frac{x}{N}\right)N(s),$			(4.18)
	$\displaystyle{\rm if}\;x=\gamma^{-}_{0,x}(t),\;{\rm then}$	$\displaystyle\gamma^{-}_{-}(s):=\gamma^{-}_{t,x}(s)=\left(-\overline{\nu}\int_{t}^{s}\frac{du}{N(u)}-\overline{\nu}\int_{0}^{t}\frac{du}{N(u)}+\frac{x}{N}\right)N(s).$			(4.19)

In other words, $\gamma^{+}_{+}$ corresponds to $\nu=\overline{\nu}$ both before and after $t$ , whereas $\gamma^{-}_{-}$ corresponds to $\nu=-\overline{\nu}$ both before and after $t$ . The other two cases are similar:

	$\displaystyle{\rm if}\;x=\gamma^{+}_{0,x}(t),\;{\rm then}$	$\displaystyle\gamma^{-}_{+}(s):=\gamma^{-}_{t,x}(s)=\left(-\overline{\nu}\int_{t}^{s}\frac{du}{N(u)}+\overline{\nu}\int_{0}^{t}\frac{du}{N(u)}+\frac{x}{N}\right)N(s),$			(4.20)
	$\displaystyle{\rm if}\;x=\gamma^{-}_{0,x}(t),\;{\rm then}$	$\displaystyle\gamma^{+}_{-}(s):=\gamma^{+}_{t,x}(s)=\left(\overline{\nu}\int_{t}^{s}\frac{du}{N(u)}-\overline{\nu}\int_{0}^{t}\frac{du}{N(u)}+\frac{x}{N}\right)N(s);$			(4.21)

where $\gamma^{-}_{+}$ corresponds to $\nu=\overline{\nu}$ before (and up to) $t$ and $\nu=-\overline{\nu}$ after $t$ , and $\gamma^{+}_{-}$ corresponds to the other way around.

Substituting these four cases into $\Psi^{+}$ and $\Psi^{-}$ in (4.16) and (4.17) further splits the latter two into four cases:

	$\displaystyle\Psi^{+}_{+}(t):=\Psi^{+}(t,\gamma^{+}_{0,x}(t)),\quad\Psi^{-}_{-}(t):=\Psi^{-}(t,\gamma^{-}_{0,x}(t));$		(4.22)
	$\displaystyle\Psi^{-}_{+}(t):=\Psi^{-}(t,\gamma^{+}_{0,x}(t)),\quad\Psi^{+}_{-}(t):=\Psi^{+}(t,\gamma^{-}_{0,x}(t)).$		(4.23)

All four are now functions of $t$ only, as $x$ has been replaced by either $\gamma^{+}_{0,x}(t)$ or $\gamma^{-}_{0,x}(t)$ .

Clearly, from (4.18)-(4.21) above, we have

\displaystyle\partial_{t}\gamma^{+}_{+}(s)=\partial_{t}\gamma^{-}_{-}(s)=0,\quad\partial_{t}\gamma^{-}_{+}(s)=\frac{2\overline{\nu}N(s)}{N(t)}>0,\quad\partial_{t}\gamma^{+}_{-}(s)=-\frac{2\overline{\nu}N(s)}{N(t)}<0.

(4.24)
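The derivatives in (4.24) can be confirmed by differentiating the four paths (4.18)-(4.21) numerically with respect to the switching time $t$ . The Python sketch below is a minimal check under our own assumptions: the volume $N(t)=(N^{1/\alpha}+t)^{\alpha}$ with $\alpha\neq 1$ (so that $\int du/N(u)$ has a closed form), hypothetical parameter values, and a central difference with a hypothetical step size.

```python
import math

# Hypothetical parameters for illustration only.
N0, ALPHA, NU_BAR = 100.0, 0.5, 0.3
C = N0 ** (1.0 / ALPHA)

def N(t):
    """Assumed total stake volume N(t) = (N0^(1/alpha) + t)^alpha."""
    return (C + t) ** ALPHA

def inv_N_integral(a, b):
    """Closed form of the integral of du / N(u) over [a, b] (valid for alpha != 1)."""
    p = 1.0 - ALPHA
    return ((C + b) ** p - (C + a) ** p) / p

def path(t, s, x, before, after):
    """Stake level at time s >= t when trading at rate `before` on [0, t] and `after` on [t, s].

    (before, after) = (+nu, +nu), (-nu, -nu), (+nu, -nu), (-nu, +nu)
    correspond to (4.18), (4.19), (4.20), (4.21), respectively.
    """
    return (after * inv_N_integral(t, s) + before * inv_N_integral(0.0, t) + x / N0) * N(s)

def d_dt(t, s, x, before, after, eps=1e-4):
    """Central-difference derivative of the path with respect to the switching time t."""
    up = path(t + eps, s, x, before, after)
    down = path(t - eps, s, x, before, after)
    return (up - down) / (2 * eps)

if __name__ == "__main__":
    t, s, x = 3.0, 8.0, 10.0
    print("gamma^+_+ :", d_dt(t, s, x, NU_BAR, NU_BAR))
    print("gamma^-_+ :", d_dt(t, s, x, NU_BAR, -NU_BAR), 2 * NU_BAR * N(s) / N(t))
    print("gamma^+_- :", d_dt(t, s, x, -NU_BAR, NU_BAR), -2 * NU_BAR * N(s) / N(t))
```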

Now, suppose $\ell(\cdot)$ and $h(\cdot)$ are both smooth, convex (and increasing) functions. Hence, $\ell^{\prime}(\cdot)\geq 0$ and $h^{\prime}(\cdot)\geq 0$ , and both are increasing functions. Then, it is readily verified:

(i)

Both $\Psi^{+}_{+}(t)$ and $\Psi^{-}_{-}(t)$ are decreasing in $t\in[0,T]$ , and so is $\Psi^{+}_{-}(t)$ ; whereas $\Psi^{-}_{+}(t)$ need not be monotone (it may increase on some intervals and decrease on others).
(ii)

Furthermore, $\Psi^{+}_{+}(t)\geq\Psi^{-}_{-}(t)$ for all $t\in[0,T]$ .

For instance, for $\Psi^{+}_{+}(t)$ in (i), consider

$\displaystyle\partial_{t}\Psi^{+}_{+}(t)$	$\displaystyle=$	$\displaystyle e^{-\beta T}N(T)\left(\frac{h^{{}^{\prime\prime}}(\gamma^{+}_{+}(T))\partial_{t}\gamma^{+}_{+}(T)}{N(t)}-\frac{h^{\prime}(\gamma^{+}_{+}(T))N^{\prime}(t)}{N^{2}(t)}\right)$	(4.25)
		$\displaystyle-\frac{N^{\prime}(t)}{N^{2}(t)}\int_{t}^{T}e^{-\beta s}N(s)\ell^{\prime}(\gamma^{+}_{+}(s))ds-e^{-\beta t}\ell^{\prime}(\gamma^{+}_{+}(t))$
		$\displaystyle+\frac{1}{N(t)}\int_{t}^{T}e^{-\beta s}N(s)\ell^{{}^{\prime\prime}}(\gamma^{+}_{+}(s))\partial_{t}\gamma^{+}_{+}(s)ds\quad\leq 0,$

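where $\leq 0$ holds because $\partial_{t}\gamma^{+}_{+}(\cdot)=0$ makes the first and last terms on the RHS vanish, while the remaining terms are non-positive since $N^{\prime}(t)\geq 0$ , $h^{\prime}\geq 0$ and $\ell^{\prime}\geq 0$ . The other two cases, $\partial_{t}\Psi^{-}_{-}(t)\leq 0$ and $\partial_{t}\Psi^{+}_{-}(t)\leq 0$ , are similarly verified.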
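Properties (i) and (ii) for $\Psi^{+}_{+}$ and $\Psi^{-}_{-}$ can be illustrated numerically. The Python sketch below uses the quadratic utilities $\ell(x)=h(x)=x^{2}/2$ (so $\ell^{\prime}(x)=h^{\prime}(x)=x$ ), the volume $N(t)=(N^{1/\alpha}+t)^{\alpha}$ , and parameter values satisfying (4.8); all of these are hypothetical choices made only for this example. It relies on (4.18)-(4.19), i.e., along a constant-rate trajectory the path started at time $t$ coincides with the path started at time $0$ .

```python
import math

# Hypothetical parameters for illustration only; they satisfy condition (4.8).
N0, ALPHA, BETA, T = 100.0, 0.5, 0.05, 10.0
X0, NU_BAR = 10.0, 0.3
C = N0 ** (1.0 / ALPHA)

def N(t):
    """Assumed total stake volume N(t) = (N0^(1/alpha) + t)^alpha."""
    return (C + t) ** ALPHA

def inv_N_integral(a, b):
    """Closed form of the integral of du / N(u) over [a, b] (valid for alpha != 1)."""
    p = 1.0 - ALPHA
    return ((C + b) ** p - (C + a) ** p) / p

def path(s, nu):
    """Constant-rate path from time 0: gamma^+_{0,x}(s) if nu > 0, gamma^-_{0,x}(s) if nu < 0."""
    return (nu * inv_N_integral(0.0, s) + X0 / N0) * N(s)

def d_ell(x):
    """Derivative of the assumed running utility ell(x) = x^2 / 2."""
    return x

def d_h(x):
    """Derivative of the assumed terminal utility h(x) = x^2 / 2."""
    return x

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if b <= a:
        return 0.0
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3

def psi_same(t, nu):
    """Psi^+_+(t) for nu = +NU_BAR and Psi^-_-(t) for nu = -NU_BAR, from (4.16)-(4.17) and (4.22)."""
    terminal = math.exp(-BETA * T) * N(T) * d_h(path(T, nu))
    running = simpson(lambda s: math.exp(-BETA * s) * N(s) * d_ell(path(s, nu)), t, T)
    return (terminal + running) / N(t)

if __name__ == "__main__":
    for i in range(0, 11, 2):
        t = T * i / 10
        print(f"t = {t:5.2f}  Psi++ = {psi_same(t, NU_BAR):.4f}  Psi-- = {psi_same(t, -NU_BAR):.4f}")
```

With these inputs both columns should decrease in $t$ , with the $\Psi^{+}_{+}$ column dominating the $\Psi^{-}_{-}$ column.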

As in the case of linear utility, the properties above can be used to compare against $\widetilde{P}_{\beta}(t)$ to identify the optimal trading strategy. Consider the case of $\widetilde{P}_{\beta}(t)$ being a constant, $\widetilde{P}_{\beta}(t)=P(0)$ for all $t\in[0,T]$ , as in part (i) of Proposition 4.1. If $\Psi^{+}_{+}(t)\geq\Psi^{-}_{-}(t)\geq P(0)$ for all $t\in[0,T]$ , then the optimal strategy is to buy all the time and at rate $\overline{\nu}$ . If $P(0)\geq\Psi^{+}_{+}(t)\geq\Psi^{-}_{-}(t)$ for all $t\in[0,T]$ , then it is optimal to sell all the time, at full capacity.

On the other hand, since $\Psi^{+}_{-}$ corresponds to sell first (before $t$ ) and then buy, this clearly cannot be optimal, as it is impossible for $\Psi^{+}_{-}\leq P(0)$ before $t$ and $\Psi^{+}_{-}\geq P(0)$ after $t$ , since $\Psi^{+}_{-}$ is decreasing in $t$ . Similarly, $\Psi^{-}_{+}$ corresponds to buy first (before $t$ ) and then sell, which can be optimal provided that $\Psi^{-}_{+}(t)$ is decreasing in $t$ .

The details are stated in the following proposition; and see Figure 4 for an illustration.

Proposition 4.3.

Assume that $\ell(\cdot)$ and $h(\cdot)$ are twice continuously differentiable, convex, and satisfy the conditions in Assumption 2.1. Assume that $\widetilde{P}_{\beta}(t)$ stays constant, i.e. $\widetilde{P}_{\beta}(t)=P(0)$ for all $t\in[0,T]$ . Further assume the condition (4.8), and that $t\to\Psi^{-}_{+}(t)$ is decreasing. Then, the following results hold:

(a)

If $P(0)\geq\Psi^{+}_{+}(T)\vee\Psi^{-}(0,x)$ , then $\nu_{*}(t)=-\overline{\nu}$ for all $0\leq t\leq T$ . That is, the participant sells at full capacity at all times.
(b)

If $P(0)\leq\Psi^{-}_{+}(T)$ , then $\nu_{*}(t)=\overline{\nu}$ for all $0\leq t\leq T$ . That is, the participant buys at full capacity at all times.
(c)

If $\Psi^{+}_{+}(T)<\Psi^{-}(0,x)$ and $\Psi^{-}_{+}(T)<P(0)<\Psi^{-}(0,x)$ , then

$\nu_{*}(t)=\left\{\begin{array}[]{lcl}\overline{\nu}&\mbox{for }t\leq t_{0},\\ -\overline{\nu}&\mbox{for }t>t_{0},\end{array}\right.$

where $t_{0}$ is the unique point in $[0,T]$ such that $\Psi^{-}_{+}(t)=P(0)$ . That is, the participant first buys and after some time sells, both at full capacity.
(d)

If $\Psi^{-}(0,x)<\Psi^{+}_{+}(T)$ , then
(1)

if $\Psi^{-}(0,x)<P(0)<\Psi^{+}_{+}(T)$ , then $\nu_{*}(t)=-\overline{\nu}$ for all $0\leq t\leq T$ . That is, the participant sells at full capacity at all times.
(2)

if $\Psi^{-}_{+}(T)<P(0)\leq\Psi^{-}(0,x)$ , then

$\nu_{*}(t)=\left\{\begin{array}[]{lcl}\overline{\nu}&\mbox{for }t\leq t_{0},\\ -\overline{\nu}&\mbox{for }t>t_{0},\end{array}\right.$

where $t_{0}$ is the unique point in $[0,T]$ such that $\Psi^{-}_{+}(t)=P(0)$ . That is, the participant first buys and after some time sells, both at full capacity.

5. Extension: Risk Control

In the previous sections, we have focused on profit-seeking objectives in which a participant’s utility increases with getting more stakes, or consuming more. In the modern finance literature, Markowitz [16] pioneered the idea of balancing return and risk in any investment, which is particularly relevant for cryptocurrency trading, as it often involves substantial volatility. In this spirit, here we add to the utility objective two “cost” terms that penalize the deviation of participant $k$ ’s holding of stakes from the average of all others. The idea is that, to the extent this deviation measures risk (analogous to the variance in the Markowitz model), it should be the price to be paid for the utility (in holding stakes) that $k$ wants to maximize. (The same idea has been used in [13] in the context of stochastic games.) Specifically, the deviation of participant $k$ ’s holding from the average of all others can be expressed as $|X_{k}(t)-\frac{N(t)}{K}|$ , taking into account $N(t)=\sum_{k=1}^{K}X_{k}(t)$ . Hence, the new objective function is:

$\displaystyle U(x):=\sup_{\{\nu(t),b(t)\}}$	$\displaystyle J(\nu,b):=\mathbb{E}\bigg{\{}\int_{0}^{\mathcal{T}}e^{-\beta t}(dc(t)+\ell(X(t))dt)+e^{-\beta\mathcal{T}}\left(b(\mathcal{T})+h(X(\mathcal{T}))\right)$
	$\displaystyle\qquad-\int_{0}^{\mathcal{T}}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt-e^{-\delta\mathcal{T}}q\left(X(\mathcal{T})-\frac{N(\mathcal{T})}{K}\right)\bigg{\}}$	(5.1)
	$\displaystyle\mbox{ subject to }X^{\prime}(t)=\nu(t)+\frac{N^{\prime}(t)}{N(t)}X(t),\,X(0)=x,$	(C0)
	$\displaystyle\qquad\qquad\quad\,dc(t)+db(t)-rb(t)dt+P(t)\nu(t)dt=0,$	(C1)
	$\displaystyle\qquad\qquad\quad\,b(0)=0,\,b(t)\geq 0\mbox{ and }0\leq X(t)\leq N(t),$	(C2)
	$\displaystyle\qquad\qquad\quad\,\|\nu(t)\|\leq\overline{\nu},$	(C3)

where $\delta>0$ is a discount factor (which may or may not be equal to $\beta$ ), and $g:\mathbb{R}\to\mathbb{R}_{+}$ and $q:\mathbb{R}\to\mathbb{R}_{+}$ are symmetric, and increasing on $\mathbb{R}_{+}$ (a typical example is $g(x)=gx^{2}$ and $q(x)=qx^{2}$ with $g,q>0$ ).

The theorem below follows the same argument as Theorem 3.4.

Theorem 5.1.

Let the assumptions in Theorem 3.4 hold for the problem (5.1). Assume that $g,q\in\mathcal{C}^{1}(\mathbb{R})$ are symmetric, and increasing on $\mathbb{R}_{+}$ . Then $U(x)=v(0,x)$ where $v(t,x)$ is the unique viscosity solution to the following HJB equation:

\left\{\begin{array}[]{lcl}\partial_{t}v+e^{-\beta t}\ell(x)-e^{-\delta t}g\left(x-\frac{N(t)}{K}\right)+\frac{xN^{\prime}(t)}{N(t)}\partial_{x}v+\sup_{|\nu|\leq\overline{\nu}}\{\nu(\partial_{x}v-\widetilde{P}_{\beta}(t))\}=0\quad\mbox{in }Q,\\ v(T,x)=e^{-\beta T}h(x)-e^{-\delta T}q\left(x-\frac{N(T)}{K}\right),\\ v(t,0)=e^{-\beta t}h(0)-e^{-\delta t}q\left(\frac{N(t)}{K}\right),\,\,v(t,N(t))=e^{-\beta t}h(N(t))-e^{-\delta t}q\left(\frac{(K-1)N(t)}{K}\right).\end{array}\right.

(5.2)

Moreover, the optimal strategy is $b_{*}(t)=0$ and $\nu_{*}(t)=\nu_{*}(t,X_{*}(t))$ for $0\leq t\leq\mathcal{T}_{*}$ (if it exists), where $\nu_{*}(t,x)$ achieves the supremum in (5.2), and $X_{*}(t)$ solves $X_{*}^{\prime}(t)=\nu_{*}(t,X_{*}(t))+\frac{N^{\prime}(t)}{N(t)}X_{*}(t)$ with $X_{*}(0)=x$ , and $\mathcal{T}_{*}:=\inf\{t>0:X_{*}(t)=0\mbox{ or }N(t)\}\wedge T$ .

In general, the HJB equation (5.2) does not have a closed-form solution even when $\ell,h$ are linear, and $g,q$ are quadratic. Again it requires numerical methods to solve the HJB equation, and then find the optimal strategy $\nu_{*}$ . Nevertheless, there is one exception where the participant is only concerned with the risk entailed by the stakes. The objective is to solve the stake parity problem:

$\displaystyle U(x):=\inf_{\nu(t)}$	$\displaystyle J(\nu):=\int_{0}^{\mathcal{T}}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt+e^{-\delta\mathcal{T}}q\left(X(\mathcal{T})-\frac{N(\mathcal{T})}{K}\right)$	(5.3)
	$\displaystyle\mbox{ subject to }X^{\prime}(t)=\nu(t)+\frac{N^{\prime}(t)}{N(t)}X(t),\,X(0)=x,$	(C0)
	$\displaystyle\qquad\qquad\quad\,b(0)=0,\,b(t)\geq 0\mbox{ and }0\leq X(t)\leq N(t),$	(C2’)
	$\displaystyle\qquad\qquad\quad\,\|\nu(t)\|\leq\overline{\nu}.$	(C3)

Since $g,q$ attain the minimum at $0$ , if $x\geq N/K$ , then the participant sells at full capacity until hitting the average $N(t)/K$ ; if $x<N/K$ , then the participant purchases at full capacity until hitting the average $N(t)/K$ . We record this simple fact in the following proposition.

Proposition 5.2.

Assume that $g,q\in\mathcal{C}^{1}(\mathbb{R})$ are symmetric, and increasing on $\mathbb{R}_{+}$ for the stake parity problem (5.3). Let $\gamma_{+}(t)$ be defined by (3.7), and

\gamma_{-}(t):=-\overline{\nu}N(t)\int_{0}^{t}\frac{ds}{N(s)}+\frac{xN(t)}{N}\quad\mbox{for }0\leq t\leq T,

(5.4)

and

$\displaystyle t_{\pm}:=\inf\left\{t>0:\overline{\nu}\int_{0}^{t}\frac{ds}{N(s)}=\pm\left(\frac{1}{K}-\frac{x}{N}\right)\right\}.$	(5.5)

Then, the following results hold.

(i) If $x>N\left(\frac{1}{K}+\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)$, then the optimal strategy is $\nu_{*}(t)=-\overline{\nu}$ for all $0\leq t\leq T$, and $U(x)=\int_{0}^{T}e^{-\delta t}g\left(\gamma_{-}(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(\gamma_{-}(T)-\frac{N(T)}{K}\right)$.
(ii) If $\frac{N}{K}<x\leq N\left(\frac{1}{K}+\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)$, then the optimal strategy is

$\nu_{*}(t)=\left\{\begin{array}{ll}-\overline{\nu}&\mbox{for }t\leq t_{-},\\ 0&\mbox{for }t>t_{-},\end{array}\right.$

and $U(x)=\int_{0}^{t_{-}}e^{-\delta t}g\left(\gamma_{-}(t)-\frac{N(t)}{K}\right)dt+\frac{g(0)}{\delta}\left(e^{-\delta t_{-}}-e^{-\delta T}\right)+e^{-\delta T}q(0)$.
(iii) If $N\left(\frac{1}{K}-\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)\leq x<\frac{N}{K}$, then the optimal strategy is

$\nu_{*}(t)=\left\{\begin{array}{ll}\overline{\nu}&\mbox{for }t\leq t_{+},\\ 0&\mbox{for }t>t_{+},\end{array}\right.$

and $U(x)=\int_{0}^{t_{+}}e^{-\delta t}g\left(\gamma_{+}(t)-\frac{N(t)}{K}\right)dt+\frac{g(0)}{\delta}\left(e^{-\delta t_{+}}-e^{-\delta T}\right)+e^{-\delta T}q(0)$.
(iv) If $x<N\left(\frac{1}{K}-\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)$, then the optimal strategy is $\nu_{*}(t)=\overline{\nu}$ for all $0\leq t\leq T$, and $U(x)=\int_{0}^{T}e^{-\delta t}g\left(\gamma_{+}(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(\gamma_{+}(T)-\frac{N(T)}{K}\right)$.
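The four cases of Proposition 5.2 reduce to a trading direction and a switching time, which can be computed numerically for a given volume path. The following Python sketch is illustrative only: the function names, the trapezoid discretization, and the treatment of the boundary case $x=N/K$ (hold throughout) are assumptions made here, and the state constraint $0\leq X(t)\leq N(t)$ is ignored.

```python
from typing import Callable, List, Tuple


def cumulative_inverse_integral(
    N: Callable[[float], float], T: float, steps: int
) -> Tuple[List[float], List[float]]:
    """Trapezoid-rule values of I(t) = int_0^t ds / N(s) on a uniform grid."""
    dt = T / steps
    times = [i * dt for i in range(steps + 1)]
    integral = [0.0]
    for i in range(steps):
        increment = 0.5 * dt * (1.0 / N(times[i]) + 1.0 / N(times[i + 1]))
        integral.append(integral[-1] + increment)
    return times, integral


def stake_parity_strategy(
    x: float, K: int, nu_bar: float,
    N: Callable[[float], float], T: float, steps: int = 10_000,
) -> Tuple[float, float]:
    """Return (rate, switch_time): trade at `rate` on [0, switch_time], then hold."""
    times, integral = cumulative_inverse_integral(N, T, steps)
    gap = x / N(0.0) - 1.0 / K        # signed distance of the initial share from 1/K
    if gap == 0.0:
        return 0.0, 0.0               # assumed convention: already at parity, hold
    rate = -nu_bar if gap > 0 else nu_bar
    target = abs(gap)                 # parity is reached when nu_bar * I(t) = |gap|
    if nu_bar * integral[-1] < target:
        return rate, T                # cases (i) and (iv): parity not reached by T
    # Cases (ii) and (iii): locate t_- or t_+ by linear interpolation on the grid.
    for i in range(1, steps + 1):
        hi = nu_bar * integral[i]
        if hi >= target:
            lo = nu_bar * integral[i - 1]
            frac = (target - lo) / (hi - lo)
            return rate, times[i - 1] + frac * (times[i] - times[i - 1])
    return rate, T


# Hypothetical data: constant volume N(t) = 100, K = 4, nu_bar = 1, T = 10.
rate, switch_time = stake_parity_strategy(30.0, 4, 1.0, lambda t: 100.0, 10.0)
```

With these hypothetical numbers, $x=30$ lies in case (ii): the participant sells at rate $\overline{\nu}=1$ until $t_{-}\approx 5$ and holds afterwards.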

Proof.

(i) If $x>N\left(\frac{1}{K}+\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)$ , we have $\gamma_{-}(t)>N(t)/K$ for all $0\leq t\leq T$ . By a comparison argument, we get $X(t)\geq\gamma_{-}(t)$ for all $0\leq t\leq T$ given any feasible strategy $\nu(t)$ . Since $g,q$ are increasing on $\mathbb{R}_{+}$ , we obtain

			$\displaystyle\int_{0}^{T}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(X(T)-\frac{N(T)}{K}\right)$
		$\displaystyle\geq$	$\displaystyle\int_{0}^{T}e^{-\delta t}g\left(\gamma_{-}(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(\gamma_{-}(T)-\frac{N(T)}{K}\right),$

which yields the desired result.

(ii) If $\frac{N}{K}<x\leq N\left(\frac{1}{K}+\overline{\nu}\int_{0}^{T}\frac{dt}{N(t)}\right)$ , we have $\gamma_{-}(t)>N(t)/K$ for $0\leq t<t_{-}$ and $\gamma_{-}(t_{-})=N(t_{-})/K$ . Again by the comparison argument, $X(t)\geq\gamma_{-}(t)$ for $0\leq t\leq t_{-}$ given any strategy. Thus,

	$\displaystyle\quad\int_{0}^{T}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(X(T)-\frac{N(T)}{K}\right)$
	$\displaystyle=\int_{0}^{t_{-}}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt+\int_{t_{-}}^{T}e^{-\delta t}g\left(X(t)-\frac{N(t)}{K}\right)dt+e^{-\delta T}q\left(X(T)-\frac{N(T)}{K}\right)$
	$\displaystyle\geq\int_{0}^{t_{-}}e^{-\delta t}g\left(\gamma_{-}(t)-\frac{N(t)}{K}\right)dt+g(0)\int_{t_{-}}^{T}e^{-\delta t}dt+e^{-\delta T}q(0),$

which allows us to conclude.

(iii) and (iv) follow from the same arguments as (ii) and (i), respectively. ∎
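The comparison argument can be checked numerically by evaluating the objective in (5.3) along different feasible strategies. The Python sketch below is an illustration under stated assumptions, not part of the proof: it uses a forward Euler scheme with horizon $T$, ignores the state constraint $0\leq X(t)\leq N(t)$, and takes hypothetical data (constant volume, quadratic $g=q$, no discounting).

```python
import math
from typing import Callable


def parity_cost(
    nu: Callable[[float], float], x: float, K: int,
    N: Callable[[float], float], dN: Callable[[float], float],
    T: float, delta: float,
    g: Callable[[float], float], q: Callable[[float], float],
    steps: int = 20_000,
) -> float:
    """Forward Euler approximation of J(nu) in (5.3) over the horizon [0, T]."""
    dt = T / steps
    X, cost = x, 0.0
    for i in range(steps):
        t = i * dt
        # Running cost: discounted penalty on the deviation from N(t)/K.
        cost += math.exp(-delta * t) * g(X - N(t) / K) * dt
        # Dynamics (C0): X' = nu + (N'/N) X.
        X += (nu(t) + dN(t) / N(t) * X) * dt
    # Terminal cost at time T.
    return cost + math.exp(-delta * T) * q(X - N(T) / K)


def square(y: float) -> float:
    return y * y


# Hypothetical data: N(t) = 100, K = 4, nu_bar = 1, T = 10, delta = 0, g = q = y^2.
# Here x = 30 falls in case (ii), with switching time t_- = 5.
common = dict(x=30.0, K=4, N=lambda t: 100.0, dN=lambda t: 0.0,
              T=10.0, delta=0.0, g=square, q=square)
bang_bang = parity_cost(lambda t: -1.0 if t <= 5.0 else 0.0, **common)
half_speed = parity_cost(lambda t: -0.5, **common)
```

For these numbers direct integration gives $125/3\approx 41.7$ for the bang-bang strategy and $250/3\approx 83.3$ for selling at half capacity throughout, and the Euler values agree up to discretization error.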

6. Conclusion

We have developed in this paper a continuous-time control approach to optimal trading under the PoS protocol, formulated as a consumption-investment problem. We present general solutions to the optimal control problem via dynamic programming and the HJB equations, and, in the case of linear and convex utility functions, closed-form solutions in the form of bang-bang controls. Furthermore, we bring out the explicit connections between the rate of return in trading/holding stakes and the participant’s risk-adjusted valuation of the stakes, such that the participant’s risk sensitivity is explicitly accounted for in the trading strategy. We have also studied a risk-control version of the consumption-investment problem, and for a special case, the “stake-parity” problem, we show that a mean-reverting strategy is optimal.

While our focus here is entirely on an individual participant’s trading strategy in a PoS protocol, it is possible to study the interactions among the participants, to formulate the problem of trading in a PoS protocol as a game (deterministic or stochastic), and to study issues such as equilibrium, social welfare, and the inclusion of a trusted third party (or market maker). This will be the focus of a follow-up paper.

Acknowledgement: W. Tang gratefully acknowledges financial support through NSF grants DMS-2113779 and DMS-2206038, and through a start-up grant at Columbia University. David Yao’s work is part of a Columbia-CityU/HK collaborative project that is supported by the InnoHK Initiative, The Government of the HKSAR, and the AIFT Lab.

References

[1] H. Alsabah and A. Capponi. Pitfalls of Bitcoin’s Proof-of-Work: R&D arms race and mining centralization. 2020. SSRN:3273982.
[2] L. Ambrosio. Transport equation and Cauchy problem for non-smooth vector fields. In Calculus of variations and nonlinear partial differential equations, volume 1927 of Lecture Notes in Math., pages 1–41. Springer, Berlin, 2008.
[3] N. Arnosti and S. M. Weinberg. Bitcoin: A natural oligopoly. Management Science, 2022.
[4] C. Bertucci, L. Bertucci, J.-M. Lasry, and P.-L. Lions. Mean field game approach to Bitcoin mining. 2020. arXiv:2004.08167.
[5] C. Bertucci, L. Bertucci, J.-M. Lasry, and P.-L. Lions. How resilient is the Bitcoin protocol? 2022. SSRN:3907822.
[6] V. Buterin. Toward a 12-second block time. 2014. Available at https://blog.ethereum.org/2014/07/11/toward-a-12-second-block-time.
[7] J. Chiu and T. V. Koeppl. The economics of cryptocurrencies–Bitcoin and beyond. 2017. SSRN:3048124.
[8] J. Chod, N. Trichakis, G. Tsoukalas, H. Aspegren, and M. Weber. On the financing benefits of supply chain transparency and blockchain adoption. Management Science, 66(10):4378–4396, 2020.
[9] F. Donovan. Healthcare blockchain could save industry $100b annually by 2025. HIT Infrastructure, 2019. Available at https://hitinfrastructure.com/news/healthcare-blockchain-could-save-industry-100b-annually-by-2025.
[10] W. Duggan and F. Powell. What is Ethereum 2.0? Understanding the merge. 2022. Available at https://www.forbes.com/uk/advisor/investing/cryptocurrency/what-is-ethereum-2/.
[11] W. H. Fleming and H. M. Soner. Controlled Markov processes and viscosity solutions, volume 25 of Stochastic Modelling and Applied Probability. Springer, New York, second edition, 2006.
[12] F. Golse. Mean field kinetic equations. 2013. Available at http://www.cmls.polytechnique.fr/perso/golse/M2/PolyKinetic.pdf.
[13] X. Guo, W. Tang, and R. Xu. A class of stochastic games and moving free boundary problems. SIAM J. Control Optim., 60(2):758–785, 2022.
[14] S. King and S. Nadal. Ppcoin: Peer-to-peer crypto-currency with proof-of-stake. 2012. Available at https://decred.org/research/king2012.pdf.
[15] Z. Li, A. M. Reppen, and R. Sircar. A mean field games model for cryptocurrency mining. 2019. arXiv:1912.01952.
[16] H. M. Markowitz. Portfolio selection: Efficient diversification of investments. Cowles Foundation for Research, Monograph 16. John Wiley & Sons, Inc., New York, 1959.
[17] C. Mora, R. L. Rollins, K. Taladay, M. B. Kantar, M. K. Chock, M. Shimada, and E. C. Franklin. Bitcoin emissions alone could push global warming above 2°C. Nat. Clim. Change, 8(11):931–933, 2018.
[18] S. Nakamoto. Bitcoin: A peer-to-peer electronic cash system. Decentralized Business Review, page 21260, 2008.
[19] S. Osher and C.-W. Shu. High-order essentially nonoscillatory schemes for Hamilton-Jacobi equations. SIAM J. Numer. Anal., 28(4):907–922, 1991.
[20] M. Platt, J. Sedlmeir, D. Platt, P. Tasca, J. Xu, N. Vadgama, and J. I. Ibañez. Energy footprint of blockchain consensus mechanisms beyond proof-of-work. 2021. arXiv:2109.03667.
[21] I. Roşu and F. Saleh. Evolution of shares in a proof-of-stake cryptocurrency. Management Science, 67(2):661–672, 2021.
[22] F. Saleh. Blockchain without waste: Proof-of-stake. The Review of Financial Studies, 34(3):1156–1190, 2021.
[23] P. E. Souganidis. Approximation schemes for viscosity solutions of Hamilton-Jacobi equations. J. Differential Equations, 59(1):1–43, 1985.
[24] W. Tang. Stability of shares in the Proof of Stake protocol – concentration and phase transitions. 2022. arXiv:2206.02227.
[25] W. Tang and D. D. Yao. Polynomial voting rules. 2022. arXiv:2206.10105.
[26] Q. Wang, R. Li, Q. Wang, and S. Chen. Non-fungible token (NFT): Overview, evaluation, opportunities and challenges. 2021. arXiv:2105.07447.
[27] A. Wood. West Virginia secretary of state reports successful blockchain voting in 2018 midterm elections. 2018. Available at https://cointelegraph.com/news/west-virginia-secretary-of-state-reports-successful-blockchain-voting-in-2018-midterm-elections.
[28] G. Wood. Ethereum: A secure decentralised generalised transaction ledger. Ethereum Project Yellow Paper, 151:1–32, 2014.
