This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Pólya urn with memory kernel and
asymptotic behaviors of autocorrelation function

Shintaro Mori [email protected] Department of Mathematics and Physics, Faculty of Science and Technology, Hirosaki University,
Bunkyo-cho 3, Hirosaki, Aomori 036-8561, Japan
   Masato Hisakado [email protected] Nomura Holdings Inc.,
Otemachi 2-2-2, Chiyoda-ku, Tokyo 100-8130, Japan
   Kazuaki Nakayama [email protected] Department of Mathematical Sciences, Faculty of Science, Shinshu University,
Asahi 3-1-1, Matsumoto, Nagano 390-8621, Japan
Abstract

Pólya urn is a stochastic process in which balls are randomly drawn from an urn of red and blue balls, and balls of the same color as the drawn balls are added. The probability of a ball of a certain color being drawn is equal to the percentage of balls of that color in the urn. We introduce arbitrary memory kernels to modify this probability. If the memory kernel decays exponentially, it is a stationary process and is mean-reverting. If the memory kernel decays by a power-law, a phase transition occurs and the asymptotic behavior of the autocorrelation function changes. An auxiliary field variable is introduced to transform the process Markovian and the field obeys a multivariate Ornstein-Uhlenbeck process. The exponents of the power law are estimated for the decay of the leading and subleading terms of the autocorrelation function. It is shown that the power law exponents changes discontinuously at the critical point.

pacs:
05.70.Fh,89.65.Gh

I Introduction

Econophysics and socio-physics are fascinating research fields in statistical physicsMantegna and Stanley (2007); Galam (2008); Castellano et al. (2009). In particular, herding or the tendency to follow the majority has attracted many researchers as it plays a crucial role in the understanding of the phenomenaArthur (1989); Bikhchandani et al. (1992); Cont and Bouchaud (2000); Mori et al. (2016). Many types of probabilistic models have been proposed to model herding and one example is the ant recruitment model, which describes the intermittent oscillation of the ants faced with two identical food sourcesKirman (1993). It incorporates a simple herding mechanism where a randomly chosen ant chooses one of the two food sources with a probability that is proportional to the numbers of ants who have chosen the food sources.

A variant of the ant recruitment model has been adopted as a model for the intermittent switching phenomena of trading strategy, trend-follower and contrarian, in the modeling of the time-series data of stock and foreign exchange marketsLux (1995). Another application is the collective bankruptcy of assetsHisakado and Mori (2020). In the modeling of the time-series data of defaults of assets, the estimation of the default probability (PD) and the default correlation is crucial as it affects the risk estimation of the investment. In addition to the default correlation for the defaults in the same year, it is necessary to model the correlation of the defaults in different years. The beta-binomial distribution model with memory kernel is a model for both correlations and it is tractable compared to the established model based on the correlation of asset valuesPluto and Tasche (2011); Hisakado and Mori (2021).

The fascinating point of the beta-binomial distribution model with memory kernel is its phase transition, and its essence is grasped by a variant of the Pólya urnG.Pólya (1931), Pólya urn with memory kernelHisakado and Mori (2020). If the memory kernel decays with the power law, the power-law index γ\gamma determines the phases of the model. If γ>1\gamma>1, the autocorrelation function (ACF) shows power-law decay with exponent δ=γ\delta=\gamma. For γ<1\gamma<1, the process becomes nonstationary and δ=0\delta=0. If γ\gamma approaches 1, δ\delta changes discontinuously to a value δc\delta_{c}. The order parameter of the phase transition is the limit value of the ACF. In the nonstationary phase with c>0c>0 for γ<1\gamma<1, the memory of the history remains forever and the estimation of the PD and other parameters becomes impossible even if the length of the time series is infinite.

This study addresses the phase transition of the Pólya urn with an arbitrary memory kernel using the stochastic differential equation (SDE). The original Pólya urn is a Markov process as the probability that the color of a newly added ball is red depends only on the present number of red balls in the urn. The Pólya urn with exponential decay memory kernel can also be described as a Markov process when one focuses on the probability that the color of a newly added ball is red. However, if the memory kernel is a power-law decay one, it is necessary to introduce an infinite number of auxiliary variables to make the process Markov. The auxiliary field obeys the SDE with a common Wiener process. Using the formulation, we derive the self-consistent equation for the ACF. By studying the equation, we succeed to derive the power-law exponents of the leading and subleading terms of the ACF.

Our analysis is very similar to the one that studies the Hawkes process using a stochastic differential equationKanazawa and D.Sornette (2020a, b). The Hawkes process is a non-Markov self-excited point processHawkes (1971). The Hawkes process shows a stationary–nonstationary phase transition when the branching ratio exceeds one. The critical and near-critical behaviors of the probability density function of the intensity have been studied by solving the Master equationKanazawa and D.Sornette (2020a).

In our study of the Pólya urn with memory kernel, we focus on the phase transition. Two universality classes are known for the phase transitions of non-linear Pólya urnsMori and Hisakado (2015a); Nakayama and Mori (2021), which have the same scaling structure of order parameter as that of the absorption state phase transitionsHinrichsen (2000). Our goal is to understand the nature of the phase transition of the ”linear” Pólya urn with power-law memory kernel based on the behaviors of the order parameter. The organization of the paper is as follows. Section II defines the model. We review some results of the Pólya urn. In section III, we study the model using the SDE. We start from the exponential decay memory kernel model. The ACF decays exponentially and the process is stationary. Next we study the power-law decay memory kernel model. We introduce an auxiliary field to make the process Markovian and we derive the self-consistent equation for the ACF. Section IV shows the results about the asymptotic behavior of the ACF for the power-law decay memory kernel model. We study the power-law exponents of the leading and sub-leading terms of the ACF. In section V, we summarize the results.

II Model

The Pólya urn is a binary stochastic process X(t){0,1},t=1,2,X(t)\in\{0,1\},t=1,2,\cdots and the probability that X(t+1)X(t+1) takes the value 1 is

P(X(t+1)=1|X(s),s=1,,t)=α+s=1tX(s)α+β+s=1t1.P(X(t+1)=1|X(s),s=1,\cdots,t)=\frac{\alpha+\sum_{s=1}^{t}X(s)}{\alpha+\beta+\sum_{s=1}^{t}1}. (1)

α,β>0\alpha,\beta>0 are parameters and can be interpreted as the initial numbers of red (1) and blue (0) balls in the urnG.Pólya (1931). They are parameters and are not necessarily restricted to be integers. We denote α+β\alpha+\beta as θ\theta.

We define the Pólya urn with an arbitrary memory kernel d(t),t=0,1,d(t),t=0,1,\cdotsHisakado and Mori (2020). We denote the cumulative sum of d(t)d(t) as Dd(t)D_{d}(t)

Dd(t)=s=1td(s1),Dd(0)=0D_{d}(t)=\sum_{s=1}^{t}d(s-1),D_{d}(0)=0

and the weighted sum of X(t)X(t) with weights d(t)d(t) as Sd(t)S_{d}(t),

Sd(t)=s=1tX(s)d(ts),Sd(0)=0.S_{d}(t)=\sum_{s=1}^{t}X(s)d(t-s)\,,\,S_{d}(0)=0.

Zd(t)Z_{d}(t) is then defined as

Zd(t)=Sd(t)Dd(t),Zd(0)=0,Zd(1)=X(1).Z_{d}(t)=\frac{S_{d}(t)}{D_{d}(t)}\,,\,Z_{d}(0)=0,Z_{d}(1)=X(1).

The probability that X(t+1)X(t+1) takes the value 1 is

P(X(t+1)=1|X(s),s=1,,t)=α+Sd(t)θ+Dd(t)=α+Dd(t)Zd(t)θ+Dd(t).P(X(t+1)=1|X(s),s=1,\cdots,t)=\frac{\alpha+S_{d}(t)}{\theta+D_{d}(t)}=\frac{\alpha+D_{d}(t)Z_{d}(t)}{\theta+D_{d}(t)}. (2)

If one chooses d(t)=1,t=0,1,d(t)=1,t=0,1,\cdots, Dd(t)=tD_{d}(t)=t and Sd(t)=s=1tX(s)S_{d}(t)=\sum_{s=1}^{t}X(s), so that (2) reduces to (1). The probabilistic rule of (2) is linear in X(s)X(s), the expected value of X(t)X(t) is E(X(t))=α/θE(X(t))=\alpha/\theta.

We are interested in the response of X(t+1)X(t+1) by the change in the initial value X(1)X(1). The ACF C(t)C(t) is defined as the covariance of X(1)X(1) and X(t+1)X(t+1) divided by the variance of X(1)X(1) as,

C(t)E(X(1)X(t+1))(α/θ)2V(X(1).C(t)\equiv\frac{E(X(1)X(t+1))-(\alpha/\theta)^{2}}{V(X(1)}.

We have another expression for C(t)C(t) as Mori and Hisakado (2015a),

C(t)=E(X(t+1)|X(1)=1)E(X(t+1)|X(1)=0).C(t)=E(X(t+1)|X(1)=1)-E(X(t+1)|X(1)=0).

X(2)X(2) depends on X(1)X(1) and E(X(2)|X(1))=(α+d(0)X(1))/(θ+d(0))E(X(2)|X(1))=(\alpha+d(0)X(1))/(\theta+d(0)), we have

C(1)=d(0)θ+d(0).C(1)=\frac{d(0)}{\theta+d(0)}.

The order parameter cc of the phase transition of the stochastic process is defined as the limit value of C(t)C(t), c=limtC(t)c=\lim_{t\to\infty}C(t). The conditional expected value of X(t+1)X(t+1) under the condition for X(1)X(1) is the average probability that X(t+1)X(t+1) takes the value 1 under the same condition.

E(X(t+1)|X(1))=E(P(X(t+1)=1)|X(1))=α+s=1tE(X(s)|X(1))d(ts)θ+Dd(t).E(X(t+1)|X(1))=E(P(X(t+1)=1)|X(1))=\frac{\alpha+\sum_{s=1}^{t}E(X(s)|X(1))d(t-s)}{\theta+D_{d}(t)}.

As C(t)=E(X(t+1)|X(1)=1)E(X(t+1)|X(1)=0)C(t)=E(X(t+1)|X(1)=1)-E(X(t+1)|X(1)=0), we have the recursive equation for C(t)C(t) as,

C(t)=s=1tC(s1)d(ts)θ+Dd(t)=d(t1)+s=2tC(s1)d(ts)θ+Dd(t).C(t)=\frac{\sum_{s=1}^{t}C(s-1)d(t-s)}{\theta+D_{d}(t)}=\frac{d(t-1)+\sum_{s=2}^{t}C(s-1)d(t-s)}{\theta+D_{d}(t)}. (3)

One can compute C(t),t2C(t),t\geq 2 using this equation and the initial condition C(1)=d(0)/(θ+d(0))C(1)=d(0)/(\theta+d(0)).

There are several choices for the memory kernel d(t),t=0,d(t),t=0,\cdots.

  1. 1.

    Pólya urn, d(t)=1,t0d(t)=1,t\geq 0.

    As we have noted before, if one choose d(t)=1,t=0,d(t)=1,t=0,\cdots, the model is the Pólya urn. We denote Zd(t)Z_{d}(t) for this case as Z(t)Z_{\infty}(t). tZ(t)tZ_{\infty}(t) obeys a binomial distribution with shape parameters α,β\alpha,\betaG.Pólya (1931). The expected value of X(t)X(t) is E(X(t))=α/θE(X(t))=\alpha/\theta. Z()Z_{\infty}(\infty) obeys the beta distribution and the variance is 1θ+1αβθ2\frac{1}{\theta+1}\cdot\frac{\alpha\beta}{\theta^{2}}. When one is interested in the dynamics of X(t)X(t) for the initial condition X(1)=xX(1)=x, it is only necessary to alter the parameters α,β\alpha,\beta and θ\theta as α+x,β+(1x)\alpha+x,\beta+(1-x) and θ+1\theta+1, respectively. The conditional expected value E(X(t+1)|X(1)=x)E(X(t+1)|X(1)=x) is (α+x)/(θ+1)(\alpha+x)/(\theta+1) for t1t\geq 1 and ACF C(t)C(t) becomes constant.

    C(t)=1θ+1,t1.C(t)=\frac{1}{\theta+1},t\geq 1. (4)

    The process is not stationary and not mean-reverting. The conditional variance of Z()Z_{\infty}(\infty) for X(1)=xX(1)=x is

    V(Z()|X(1)=x)=1θ+2(α+x)(β+(1x))(θ+1)2.V(Z_{\infty}(\infty)|X(1)=x)=\frac{1}{\theta+2}\cdot\frac{(\alpha+x)(\beta+(1-x))}{(\theta+1)^{2}}. (5)
  2. 2.

    Finite memory kernel, d(t)=1,t<rd(t)=1,t<r and d(t)=0,trd(t)=0,t\geq r.

    There is a cutoff rr in the memory kernel, and d(t)=1d(t)=1 for 0t<r0\leq t<r, and d(t)=0d(t)=0 for trt\geq r. For t>rt>r, X(t)X(t) is influenced by the most recent rr variables X(s),s=t1,,trX(s),s=t-1,\cdots,t-r. For trt\leq r, X(t)X(t) are influenced from all past variables X(s),s=1,,t1X(s),s=1,\cdots,t-1. Zd(t)Z_{d}(t) is defined as

    Zd(t)={1rs=tr+1tX(s)tr,1ts=1tX(s)t<r.Z_{d}(t)=\left\{\begin{array}[]{cc}\frac{1}{r}\sum_{s=t-r+1}^{t}X(s)&t\geq r,\\ \frac{1}{t}\sum_{s=1}^{t}X(s)&t<r.\end{array}\right.

    The probability that Zd(t),t>rZ_{d}(t),t>r changes by ±1/r\pm 1/r for the condition Zd(t)=zdZ_{d}(t)=z_{d} is given as

    P(ΔZd(t)=+1/r|Zd(t)=zd)\displaystyle P(\Delta Z_{d}(t)=+1/r|Z_{d}(t)=z_{d}) =\displaystyle= P(X(tr+1)=0|Zd(t)=zd)P(X(t+1)=1|Zd(t)=zd),\displaystyle P(X(t-r+1)=0|Z_{d}(t)=z_{d})\cdot P(X(t+1)=1|Z_{d}(t)=z_{d}),
    P(ΔZd(t)=1/r|Zd(t)=zd)\displaystyle P(\Delta Z_{d}(t)=-1/r|Z_{d}(t)=z_{d}) =\displaystyle= P(X(tr+1)=1|Zd(t)=zd)P(X(t+1)=0|Zd(t)=zd).\displaystyle P(X(t-r+1)=1|Z_{d}(t)=z_{d})\cdot P(X(t+1)=0|Z_{d}(t)=z_{d}).

    If one adopts the mean field approximation as P(X(tr+1)=1|Zd(t)=zd)=zd(t)P(X(t-r+1)=1|Z_{d}(t)=z_{d})=z_{d}(t) and P(X(tr+1)=0|Zd(t)=zd)=1zd(t)P(X(t-r+1)=0|Z_{d}(t)=z_{d})=1-z_{d}(t), the probabilities can be written as

    P(ΔZd(t)=+1/r|Zd(t)=zd)\displaystyle P(\Delta Z_{d}(t)=+1/r|Z_{d}(t)=z_{d}) =\displaystyle= (1zd)α+rzd(t)θ+r\displaystyle(1-z_{d})\cdot\frac{\alpha+rz_{d}(t)}{\theta+r}
    P(ΔZd(t)=1/r|Zd(t)=zd)\displaystyle P(\Delta Z_{d}(t)=-1/r|Z_{d}(t)=z_{d}) =\displaystyle= zdβ+r(1zd(t))θ+r\displaystyle z_{d}\cdot\frac{\beta+r(1-z_{d}(t))}{\theta+r}

    By the approximation, the model becomes the Kirman’s ant colony modelKirman (1993). In the Kirman’s ant colony model, there are two food sources: 0,10,1 and rr ants. The numbers of ants that have chosen food sources 1 and 0 are n1n_{1} and n0n_{0}, respectively. A randomly selected ant chooses food sources 1 and 0 with a probability that is linear in n1n_{1} and n0n_{0} among r1r-1 ants. The probability that the ratio z=n1/rz=n_{1}/r changes ±1/r\pm 1/r is given as

    P(Δz=+1|n1(t)/r=z)\displaystyle P(\Delta z=+1|n_{1}(t)/r=z) =\displaystyle= (1z)α+(r1)zθ+(r1),\displaystyle(1-z)\cdot\frac{\alpha+(r-1)z}{\theta+(r-1)},
    P(Δz=1/r|n1(t)/r=z)\displaystyle P(\Delta z=-1/r|n_{1}(t)/r=z) =\displaystyle= zβ+(r1)(1z)θ+(r1).\displaystyle z\cdot\frac{\beta+(r-1)(1-z)}{\theta+(r-1)}.

    The prefactors in the right hand side of the equations are the probabilities that the randomly selected ants are from food sources 0 and 1, respectively. The remaining terms are the probabilities that the selected ant chooses food sources 1 and 0, respectively. The α,β\alpha,\beta are noises in the decision of the ants. The Pólya urn with finite memory kernel corresponds to the ant colony model in the mean filed approximation.

    The stationary distribution of n1n_{1} is a beta binomial distribution, with shape parameters α,β\alpha,\beta. C(t)C(t) decays exponentially as C(t)exp(t/ξ)C(t)\sim\exp(-t/\xi), which defines the correlation length ξ\xi. ξ\xi depends on the length of the memory rr and diverges in the limit rr\to\inftyMori and Hisakado (2015b).

  3. 3.

    Exponential decay memory kernel, d(t)=ertd(t)=e^{-rt}.

    A previous study reveals that C(t)C(t) decays exponentially as C(t)et/ξC(t)\sim e^{-t/\xi} for t>>1t>>1Hisakado and Mori (2020). The correlation length ξ\xi is determined by rr,

    1/ξ=ln(θ(1er)+1θer(1er)+1)>0,1/\xi=\ln\left(\frac{\theta(1-e^{-r})+1}{\theta e^{-r}(1-e^{-r})+1}\right)>0,

    and ξ\xi diverges in the limit r0r\to 0. In the limit rr\to\infty, we have d(t)=0d(t)=0 for t>0t>0. X(t+1)X(t+1) depends only on X(t)X(t), which corresponds to the correlated random walkBo¨\ddot{o}hm (2000).

  4. 4.

    Power-law decay memory kernel, d(t)=(1+t)γd(t)=(1+t)^{-\gamma}.

    The model shows a phase transition at γ=1\gamma=1Hisakado and Mori (2020). By the numerical study of the ACF C(t)C(t), it was conjectured that the asymptotic behavior of C(t)C(t) is

    C(t){ctγγ>1,ctδcγ=1,ct0γ<1.C(t)\sim\left\{\begin{array}[]{cc}ct^{-\gamma}&\gamma>1,\\ ct^{-\delta_{c}}&\gamma=1,\\ ct^{0}&\gamma<1.\end{array}\right.

    The critical power-law exponent δc\delta_{c} at γ=1\gamma=1 is given as the solution of the next equation,

    θ=limt(1/(t+1)t/(t+1)xδc(1x)1𝑑xlnt).\theta=\lim_{t\to\infty}\left(\int_{1/(t+1)}^{t/(t+1)}x^{-\delta_{c}}(1-x)^{-1}dx-\ln t\right). (6)

III Stochastic differential equation

We study the Pólya urn with memory kernel using the method of stochastic approximation Gardiner (2009); Renlund (2010). We start from the exponential decay memory kernel model. For the reader’s convenience, we summarize the results of the analysis of the original Pólya urn using the SDE in Appendix A.

III.1 Exponential decay memory kernel model : d(t)=ertd(t)=e^{-rt}

We denote Dd(t),Sd(t),Zd(t)D_{d}(t),S_{d}(t),Z_{d}(t) for d(t)=ertd(t)=e^{-rt} as D(r,t)=s=1ter(ts)D(r,t)=\sum_{s=1}^{t}e^{-r(t-s)},S(r,t)=s=1tX(s)er(ts)S(r,t)=\sum_{s=1}^{t}X(s)e^{-r(t-s)}, Z(r,t)=S(r,t)/D(r,t)Z(r,t)=S(r,t)/D(r,t), respectively. We decompose S(r,t+1)S(r,t+1) and D(r,t+1)D(r,t+1) as

D(r,t+1)\displaystyle D(r,t+1) =\displaystyle= D(r,t)er+1,\displaystyle D(r,t)e^{-r}+1,
S(r,t+1)\displaystyle S(r,t+1) =\displaystyle= s=1t+1X(s)er(t+1s)=S(r,t)er+X(t+1).\displaystyle\sum_{s=1}^{t+1}X(s)e^{-r(t+1-s)}=S(r,t)e^{-r}+X(t+1).

Z(r,t+1)Z(r,t+1) can be decomposed as

Z(r,t+1)=S(r,t+1)D(r,t+1)=Z(r,t)+X(t+1)Z(r,t)D(r,t+1).Z(r,t+1)=\frac{S(r,t+1)}{D(r,t+1)}=Z(r,t)+\frac{X(t+1)-Z(r,t)}{D(r,t+1)}. (7)

For t>>1t>>1, D(r,t)=(1ert)/(1er)1/(1er)D(r,t)=(1-e^{-rt})/(1-e^{-r})\simeq 1/(1-e^{-r}) and we write Dr=1/(1er)D_{r}=1/(1-e^{-r}). The conditional expected value and the variance of Z(r,t+1)Z(r,t+1) for Z(r,t)=zrZ(r,t)=z_{r} is

E(Z(r,t+1)|Z(r,t)=zr)zr\displaystyle E(Z(r,t+1)|Z(r,t)=z_{r})-z_{r} =\displaystyle= E(X(t+1)|Z(r,t)=zr)zrD(r,t+1)αθzrDr(Dr+θ)\displaystyle\frac{E(X(t+1)|Z(r,t)=z_{r})-z_{r}}{D(r,t+1)}\simeq\frac{\alpha-\theta z_{r}}{D_{r}(D_{r}+\theta)}
V(Z(r,t+1)|Z(r,t)=zr)\displaystyle V(Z(r,t+1)|Z(r,t)=z_{r}) =\displaystyle= 1D(r,t+1)2V(X(t+1)|Z(r,t)=zr)\displaystyle\frac{1}{D(r,t+1)^{2}}V(X(t+1)|Z(r,t)=z_{r})
\displaystyle\simeq 1Dr2(α+Drzr)(β+Dr(1zr))(Dr+θ)2.\displaystyle\frac{1}{D_{r}^{2}}\cdot\frac{(\alpha+D_{r}z_{r})(\beta+D_{r}(1-z_{r}))}{(D_{r}+\theta)^{2}}.

Based on the assumption that ACF decays with tt rapidly, Z(r,t)Z(r,t) fluctuates around its mean value and E(Z(r,t))=α/θE(Z(r,t))=\alpha/\theta, we approximate the conditional variance of Z(r,t+1)Z(r,t+1) for Z(r,t)=zrZ(r,t)=z_{r} by replacing zrz_{r} with α/θ\alpha/\theta as,

V(Z(r,t+1)|Z(r,t)=zr)B2Dr2,B2=αβθ2.V(Z(r,t+1)|Z(r,t)=z_{r})\simeq\frac{B^{2}}{D_{r}^{2}},B^{2}=\frac{\alpha\beta}{\theta^{2}}.

One reads the drift and diffusion terms from the conditional expected value and the variance, and the SDE becomes

dZ(r,t)=Ar(αθZ(r,t))dt+BrdWtdZ(r,t)=A_{r}(\alpha-\theta Z(r,t))dt+B_{r}dW_{t}

Ar,BrA_{r},B_{r} are defined as,

Ar=1Dr(Dr+θ),Br=BDr.A_{r}=\frac{1}{D_{r}(D_{r}+\theta)},B_{r}=\frac{B}{D_{r}}.

It is the Ornstein-Uhlenbeck process. The condition of mean-reverting Arθ>0A_{r}\theta>0 is satisfied and Z(r,t)Z(r,t) fluctuates around α/θ\alpha/\thetaGardiner (2009); Vatiwutipong and Phewchean (2019).

The solution for the SDE with the initial condition Z(r,1)=X(1)=xZ(r,1)=X(1)=x is

Z(r,t)=αθ(1eθAr(t1))+xeθAr(t1)+BreθAr(t1)1teθAr(s1)𝑑Ws.Z(r,t)=\frac{\alpha}{\theta}(1-e^{-\theta A_{r}(t-1)})+x\cdot e^{-\theta A_{r}(t-1)}+B_{r}e^{-\theta A_{r}(t-1)}\int_{1}^{t}e^{\theta A_{r}(s-1)}dW_{s}.

The conditional expected value and the variance of Z(r,t)Z(r,t) for the initial condition X(1)=xX(1)=x are

E(Z(r,t)|X(1)=x)\displaystyle E(Z(r,t)|X(1)=x) =\displaystyle= αθ(1eθAr(t1))+xeθAr(t1),\displaystyle\frac{\alpha}{\theta}(1-e^{-\theta A_{r}(t-1)})+x\cdot e^{-\theta A_{r}(t-1)},
V(Z(r,t)|X(1)=x)\displaystyle V(Z(r,t)|X(1)=x) =\displaystyle= Br22θAr(1eθAr(t1))Br22θAr.\displaystyle\frac{B_{r}^{2}}{2\theta A_{r}}(1-e^{-\theta A_{r}(t-1)})\simeq\frac{B_{r}^{2}}{2\theta A_{r}}.

The conditional variance does not depend on the initial condition. For r<<1r<<1, Dr1/rD_{r}\simeq 1/r, Ar1/Dr2=r2,Br2r2B2A_{r}\simeq 1/D_{r}^{2}=r^{2},B_{r}^{2}\simeq r^{2}B^{2} and V(Z(r,t)|X(1)=x)B2/(2θ)V(Z(r,t)|X(1)=x)\simeq B^{2}/(2\theta). The ACF is estimated as

C(t)\displaystyle C(t) =\displaystyle= E(X(t+1)|X(1)=1)E(X(t+1)|X(1)=0)\displaystyle E(X(t+1)|X(1)=1)-E(X(t+1)|X(1)=0)
=\displaystyle= E(P(X(t+1)=1)|X(1)=1)E(P(X(t+1)=1)|X(1)=0)\displaystyle E(P(X(t+1)=1)|X(1)=1)-E(P(X(t+1)=1)|X(1)=0)
=\displaystyle= DrE(Z(r,t)|Z(1)=1)+αDr+θDrE(Z(r,t)|X(1)=0)+αDr+θ\displaystyle\frac{D_{r}E(Z(r,t)|Z(1)=1)+\alpha}{D_{r}+\theta}-\frac{D_{r}E(Z(r,t)|X(1)=0)+\alpha}{D_{r}+\theta}
=\displaystyle= DrDr+θeθAr(t1)=DrDr+θeθ(Dr+θ)Drt.\displaystyle\frac{D_{r}}{D_{r}+\theta}e^{-\theta A_{r}(t-1)}=\frac{D_{r}}{D_{r}+\theta}e^{-\frac{\theta}{(D_{r}+\theta)D_{r}}t}.

C(t)C(t) decays exponentially and C(t)eθr2tC(t)\propto e^{-\theta r^{2}t} for r<<1r<<1. Z(r,t)Z(r,t) forgets the initial value X(1)X(1) and fluctuates around α/θ\alpha/\theta with variance B2/(2θ)B^{2}/(2\theta) for r<<1r<<1. In the limit r0r\to 0, D(r,t)tD(r,t)\to t, and C(t)C(t) becomes constant. The stochastic process remembers its initial condition X(1)X(1) and the process is not stationary. A discontinuous phase transition occurs at r=0r=0.

III.2 Power-law decay memory kernel model : d(t)=(t+1)γd(t)=(t+1)^{-\gamma}

We denote Dd(t),Sd(t),Zd(t)D_{d}(t),S_{d}(t),Z_{d}(t) for d(t)=(t+1)γd(t)=(t+1)^{-\gamma} as Dγ(t)=s=1t(ts+1)γD_{\gamma}(t)=\sum_{s=1}^{t}(t-s+1)^{-\gamma}, Sγ(t)=s=1tX(s)(ts+1)γS_{\gamma}(t)=\sum_{s=1}^{t}X(s)(t-s+1)^{-\gamma}, Zγ(t)=Sγ(t)/Dγ(t)Z_{\gamma}(t)=S_{\gamma}(t)/D_{\gamma}(t), respectively. Different from the exponential decay memory kernel model, it is impossible to decompose Zγ(t+1)Z_{\gamma}(t+1) with Zγ(t)Z_{\gamma}(t) and X(t+1)X(t+1) directly. We introduce an auxiliary field {Z(r,t),0<r<}\{Z(r,t),0<r<\infty\} and express Zγ(t)Z_{\gamma}(t) as the superposition of the field. Using the decomposition of Z(r,t+1)Z(r,t+1) with Z(r,t)Z(r,t) and X(t+1)X(t+1) of (7), we derive the SDE for the auxiliary field Z(r,t)Z(r,t) and estimate Zγ(t)Z_{\gamma}(t).

First, we express (t+1)γ(t+1)^{-\gamma} as the superposition of erte^{-rt} with the gamma distribution fγ(r)=rγ1er/Γ(γ)f_{\gamma}(r)=r^{\gamma-1}e^{-r}/\Gamma(\gamma) as

(t+1)γ=0ertfγ(r)𝑑r.(t+1)^{-\gamma}=\int_{0}^{\infty}e^{-rt}\cdot f_{\gamma}(r)dr. (8)

We also express Dγ(t)D_{\gamma}(t) and Sγ(t)S_{\gamma}(t) as the superposition of D(r,t)D(r,t) and S(r,t)S(r,t) with fγ(r)f_{\gamma}(r).

Dγ(t)\displaystyle D_{\gamma}(t) =\displaystyle= 0D(r,t)fγ(r)𝑑r\displaystyle\int_{0}^{\infty}D(r,t)f_{\gamma}(r)dr
Sγ(t)\displaystyle S_{\gamma}(t) =\displaystyle= 0S(r,t)fγ(r)𝑑r=0D(r,t)Z(r,t)fγ(r)𝑑r\displaystyle\int_{0}^{\infty}S(r,t)f_{\gamma}(r)dr=\int_{0}^{\infty}D(r,t)Z(r,t)f_{\gamma}(r)dr

Zγ(t)Z_{\gamma}(t) is then written as

Zγ(t)=Sγ(t)Dγ(t)=0Z(r,t)D(r,t)fγ(r)𝑑r0D(r,t)fγ(r)𝑑r.Z_{\gamma}(t)=\frac{S_{\gamma}(t)}{D_{\gamma}(t)}=\frac{\int_{0}^{\infty}Z(r,t)D(r,t)f_{\gamma}(r)dr}{\int_{0}^{\infty}D(r,t)f_{\gamma}(r)dr}.

We denote the average of function A(r,t)A(r,t) with the weight D(r,t)fγ(r)D(r,t)f_{\gamma}(r) by <A(r,t)><A(r,t)>,

<A(r,t)>0A(r,t)D(r,t)fγ(r)𝑑rDγ(t).<A(r,t)>\equiv\frac{\int_{0}^{\infty}A(r,t)D(r,t)f_{\gamma}(r)dr}{D_{\gamma}(t)}.

The denominator ensures the identity <1>=1<1>=1 and Zγ(t)Z_{\gamma}(t) is written as <Z(r,t)><Z(r,t)>.

Next we derive the SDE for the auxiliary field {Z(r,t),0<r<}\{Z(r,t),0<r<\infty\}. When {Z(r,t)=z(r),0<r<}\{Z(r,t)=z(r),0<r<\infty\}, Zγ(t)Z_{\gamma}(t) is estimated as Zγ(t)=zγ<z(r)>Z_{\gamma}(t)=z_{\gamma}\equiv<z(r)>. The probability that X(t+1)X(t+1) takes the value 1 with the condition {Z(r,t)=z(r),0<r<}\{Z(r,t)=z(r),0<r<\infty\} is then given as

P(X(t+1)=1|{Z(r,t)=z(r),0<r<})=α+Dγ(t)<z(r)>θ+Dγ(t)=α+Dγ(t)zγθ+Dγ(t).P(X(t+1)=1|\{Z(r,t)=z(r),0<r<\infty\})=\frac{\alpha+D_{\gamma}(t)<z(r)>}{\theta+D_{\gamma}(t)}=\frac{\alpha+D_{\gamma}(t)z_{\gamma}}{\theta+D_{\gamma}(t)}.

As Z(r,t+1)=Z(r,t)+X(t+1)Z(r,t)D(r,t+1)Z(r,t+1)=Z(r,t)+\frac{X(t+1)-Z(r,t)}{D(r,t+1)} in (7), we have,

E(Z(r,t+1)|{Z(r,t)=z(r),0<r<})=z(r)+E(X(t+1)|Zγ(t)=zγ)z(r)D(r,t+1)\displaystyle E(Z(r,t+1)|\{Z(r,t)=z(r),0<r<\infty\})=z(r)+\frac{E(X(t+1)|Z_{\gamma}(t)=z_{\gamma})-z(r)}{D(r,t+1)} (9)
=\displaystyle= z(r)+θz0+Dγ(t)<z(r)>(θ+Dγ(t))z(r)(θ+Dγ(t))D(r,t+1).\displaystyle z(r)+\frac{\theta z_{0}+D_{\gamma}(t)<z(r)>-(\theta+D_{\gamma}(t))z(r)}{(\theta+D_{\gamma}(t))D(r,t+1)}.

Here, we define z0=α/θz_{0}=\alpha/\theta. We denote the weighted average of z0z_{0} and zγ(t)=<z(r,t)>z_{\gamma}(t)=<z(r,t)> with the weights θ\theta and Dγ(t)D_{\gamma}(t) as z(t)z(t),

z(t)θz0+Dγ(t)zγ(t)θ+Dγ(t)=θz0+Dγ(t)<z(r,t)>θ+Dγ(t).z(t)\equiv\frac{\theta z_{0}+D_{\gamma}(t)z_{\gamma}(t)}{\theta+D_{\gamma}(t)}=\frac{\theta z_{0}+D_{\gamma}(t)<z(r,t)>}{\theta+D_{\gamma}(t)}. (10)

Then, the conditional expected value is written as

E(Z(r,t+1)|{Z(r,t)=z(r),0<r<})=z(r)+z(t)z(r)Dr(t+1).E(Z(r,t+1)|\{Z(r,t)=z(r),0<r<\infty\})=z(r)+\frac{z(t)-z(r)}{D_{r}(t+1)}.

Likewise, we estimate the conditional variance as

V(Z(r,t+1)|{Zr(t)=zr,0<r<})=V(X(t+1)|{Z(r,t)=z(r),0<r<})Dr(t+1)2\displaystyle V(Z(r,t+1)|\{Z_{r}(t)=z_{r},0<r<\infty\})=\frac{V(X(t+1)|\{Z(r,t)=z(r),0<r<\infty\})}{D_{r}(t+1)^{2}}
=\displaystyle= (Dγ(t)zγ+θz0)(Dγ(t)(1zγ)+θ(1z0))Dr(t+1)2(Dγ(t+1)+θ)2=z(t)(1z(t))Dr(t+1)2.\displaystyle\frac{(D_{\gamma}(t)z_{\gamma}+\theta z_{0})(D_{\gamma}(t)(1-z_{\gamma})+\theta(1-z_{0}))}{D_{r}(t+1)^{2}(D_{\gamma}(t+1)+\theta)^{2}}=\frac{z(t)(1-z(t))}{D_{r}(t+1)^{2}}.

We assume that z(t) does not converge to 0 nor 1 and the numerator of the variance does not vanish. We can replace the numerator of the variance with constant B2B^{2} for simplicity without affecting the essence of the process. One reads the drift and the diffusion terms and the SDE for the auxiliary field {Z(r,t),0<r<}\{Z(r,t),0<r<\infty\} becomes

dZ(r,t)=Z(t)Z(r,t)Dr(t)+BDr(t)dWt,0<r<..dZ(r,t)=\frac{Z(t)-Z(r,t)}{D_{r}(t)}+\frac{B}{D_{r}(t)}dW_{t},0<r<\infty.. (11)

We note that the auxiliary field {Z(r,t),0<r<}\{Z(r,t),0<r<\infty\} shares the common Wiener process WtW_{t} as it emerges from X(t+1)X(t+1).

The ”formal” solution for the auxiliary field {Z(r,t)}\{Z(r,t)\} to the initial value problem {Z(r,1)=X(1)=x,0<r<}\{Z(r,1)=X(1)=x,0<r<\infty\} is

Z(r,t)=I(r,t)x+I(r,t)1tZ(s)D(r,s)I(r,s)1𝑑s+I(r,t)0tBD(r,s)I(r,s)1𝑑Ws,0<r<.Z(r,t)=I(r,t)\cdot x+I(r,t)\int_{1}^{t}\frac{Z(s)}{D(r,s)}I(r,s)^{-1}ds+I(r,t)\int_{0}^{t}\frac{B}{D(r,s)}I(r,s)^{-1}dW_{s},0<r<\infty.

Here, I(r,t)I(r,t) is defined as

I(r,t)exp(1t1D(r,s)𝑑s)=(ert1er1)1err.I(r,t)\equiv\exp\left(-\int_{1}^{t}\frac{1}{D(r,s)}ds\right)=\left(\frac{e^{rt}-1}{e^{r}-1}\right)^{-\frac{1-e^{-r}}{r}}.

The reason to call the solution ”formal” is that Z(s)Z(s) in the integral of the second term of the solution is calculated using {Z(r,t),0<r<}\{Z(r,t),0<r<\infty\}. We consider the deviation of Z(r,t)Z(r,t) from z0z_{0} as ΔZ(r,t)\Delta Z(r,t), i.e., Z(r,t)=z0+ΔZ(r,t)Z(r,t)=z_{0}+\Delta Z(r,t). ΔZγ(t)\Delta Z_{\gamma}(t) is also defined as ΔZγ(t)=<ΔZ(r,t)>=<Z(r,t)z0>=Zγ(t)z0\Delta Z_{\gamma}(t)=<\Delta Z(r,t)>=<Z(r,t)-z_{0}>=Z_{\gamma}(t)-z_{0}. Z(t)Z(t) is then written as

Z(t)=z0+Dγ(t)ΔZγ(t)θ+Dγ(t).Z(t)=z_{0}+\frac{D_{\gamma}(t)\Delta Z_{\gamma}(t)}{\theta+D_{\gamma}(t)}.

The formal solution for {Z(r,t)}\{Z(r,t)\} becomes

Z(r,t)\displaystyle Z(r,t) =\displaystyle= I(r,t)x+(1I(r,t))z0\displaystyle I(r,t)\cdot x+(1-I(r,t))\cdot z_{0}
+\displaystyle+ I(r,t)(1t1D(r,s)(Dγ(s)Δzγ(s)θ+Dγ(s))I(r,s)1𝑑s+0tBD(r,s)I(r,s)1𝑑Ws),0<r<.\displaystyle I(r,t)\left(\int_{1}^{t}\frac{1}{D(r,s)}\left(\frac{D_{\gamma}(s)\Delta z_{\gamma}(s)}{\theta+D_{\gamma}(s)}\right)I(r,s)^{-1}ds+\int_{0}^{t}\frac{B}{D(r,s)}I(r,s)^{-1}dW_{s}\right),0<r<\infty.

Zγ(t)=<Z(r,t)>Z_{\gamma}(t)=<Z(r,t)> is then estimated as

Zγ(t)\displaystyle Z_{\gamma}(t) =\displaystyle= z0+<Ir(t)>(xz0)\displaystyle z_{0}+<I_{r}(t)>(x-z_{0})
+\displaystyle+ I(r,t)(1t1D(r,s)(Dγ(s)ΔZγ(s)θ+Dγ(s))I(r,s)1𝑑s+0tBD(r,s)I(r,s)1𝑑Ws).\displaystyle\left<I(r,t)(\int_{1}^{t}\frac{1}{D(r,s)}\left(\frac{D_{\gamma}(s)\Delta Z_{\gamma}(s)}{\theta+D_{\gamma}(s)}\right)I(r,s)^{-1}ds+\int_{0}^{t}\frac{B}{D(r,s)}I(r,s)^{-1}dW_{s})\right>.

To estimate the average <><> in the right hand side of the solution for Zγ(t)Z_{\gamma}(t), we derive a useful approximation integral formula for the arbitrary function f(r)f(r), which is valid for large tt.

0f(r)fγ(r)ert𝑑r=01/(t+1)f(r)γrγ1𝑑r.\int_{0}^{\infty}f(r)f_{\gamma}(r)e^{-rt}dr=\int_{0}^{1/(t+1)}f(r)\cdot\gamma r^{\gamma-1}dr.

It replaces the definite integral of f(r)f(r) multiplied by fγ(r)ertf_{\gamma}(r)e^{-rt} over r[0,)r\in[0,\infty) with the definite integral of f(r)f(r) multiplied by γrγ1\gamma r^{\gamma-1} over r[0,1/(t+1)]r\in[0,1/(t+1)]. For large tt, ert<<1e^{-rt}<<1 for r>>1/tr>>1/t and f(r)f(r) with r0r\sim 0 dominates the asymptotic behavior. In the limit r0r\to 0, fγ(r)f_{\gamma}(r) can be replaced with γrγ1\gamma r^{\gamma-1}. Here, we adopt γrγ1\gamma r^{\gamma-1} instead of rγ1/Γ(γ)r^{\gamma-1}/\Gamma(\gamma) to ensure that the approximation formula gives (t+1)γ(t+1)^{-\gamma} for f(r)=1f(r)=1, which is the identity of (8). The formula is useful to study the asymptotic (large tt) behavior of the left-hand side of the equation.

Then, we have two approximation formulas,

<Ir(t)>\displaystyle<I_{r}(t)> \displaystyle\simeq tγDγ(t),\displaystyle\frac{t^{-\gamma}}{D_{\gamma}(t)},
<Ir(t)1Dr(s)Ir(s)1>\displaystyle<I_{r}(t)\frac{1}{D_{r}(s)}I_{r}(s)^{-1}> \displaystyle\simeq (ts+1)γDγ(t).\displaystyle\frac{(t-s+1)^{-\gamma}}{D_{\gamma}(t)}.

Using the results, we have

Zγ(t)\displaystyle Z_{\gamma}(t) =\displaystyle= z0+tγDγ(t)(xz0)\displaystyle z_{0}+\frac{t^{-\gamma}}{D_{\gamma}(t)}(x-z_{0})
+\displaystyle+ 1Dγ(t)(1t(ts+1)γ(Dγ(s)Δzγ(s)θ+Dγ(s))𝑑s+1t(ts+1)γB𝑑Ws).\displaystyle\frac{1}{D_{\gamma}(t)}\left(\int_{1}^{t}(t-s+1)^{-\gamma}\left(\frac{D_{\gamma}(s)\Delta z_{\gamma}(s)}{\theta+D_{\gamma}(s)}\right)ds+\int_{1}^{t}(t-s+1)^{-\gamma}BdW_{s}\right).

The conditional expected value of Zγ(t)Z_{\gamma}(t) for X(1)=xX(1)=x is

E(Zγ(t)|X(1)=x)\displaystyle E(Z_{\gamma}(t)|X(1)=x) =\displaystyle= z0+tγDγ(t)(xz0)\displaystyle z_{0}+\frac{t^{-\gamma}}{D_{\gamma}(t)}(x-z_{0})
+\displaystyle+ 1Dγ(t)(1t(ts+1)γ(Dγ(s)(E(Zγ(s)|X(1)=x)z0)θ+Dγ(s))𝑑s).\displaystyle\frac{1}{D_{\gamma}(t)}\left(\int_{1}^{t}(t-s+1)^{-\gamma}\left(\frac{D_{\gamma}(s)(E(Z_{\gamma}(s)|X(1)=x)-z_{0})}{\theta+D_{\gamma}(s)}\right)ds\right).

C(t)C(t) is then given as

C(t)=1Dγ(t)+θ(tγ+1t(ts+1)γC(s)).C(t)=\frac{1}{D_{\gamma}(t)+\theta}\left(t^{-\gamma}+\int_{1}^{t}(t-s+1)^{-\gamma}C(s)\right). (12)

We derive the integral equation for C(t)C(t), which is the continuous approximation of (3).

IV Asymptotic behavior of Correlation function

We study the asymptotic behavior of the ACF for the power-law decay memory kernel model. We assume the power-law asymptotic behavior of the leading and sub-leading terms of the ACF as

C(t)=ctδ+ctδ,δ>δfort>>1.C(t)=ct^{-\delta}+c^{\prime}t^{-\delta^{\prime}},\delta^{\prime}>\delta\,\,\mbox{for}\,\,t>>1. (13)

We multiply (Dγ(t)+θ)(D_{\gamma}(t)+\theta) on both sides of (12) and we obtain

(Dγ(t)+θ)C(t)=(tγ+1t(ts+1)γC(s)𝑑s).(D_{\gamma}(t)+\theta)C(t)=\left(t^{-\gamma}+\int_{1}^{t}(t-s+1)^{-\gamma}C(s)ds\right). (14)

We substitute (13) into (12) and compare the power-law exponent of the leading and subleading terms on both sides of the equation.

The asymptotic behaviors of Dγ(t)D_{\gamma}(t) are 1/(γ1),lnt1/(\gamma-1),\ln t and t1γ/(1γ)t^{1-\gamma}/(1-\gamma) for γ>1,=1\gamma>1,=1 and <1<1, respectively. We put ctδ+ctδct^{-\delta}+c^{\prime}t^{-\delta^{\prime}} in C(t)C(t) and estimate the integral of the right hand side of (14).

1t(ts+1)γ(csδ+csδ)𝑑s.\int_{1}^{t}(t-s+1)^{-\gamma}(cs^{-\delta}+cs^{-\delta^{\prime}})ds.

After changing the variable from ss to x=s/(t+1)x=s/(t+1), the first term of the integral becomes

c(t+1)1δγϵ1ϵ(1x)γxδ𝑑x.c(t+1)^{1-\delta-\gamma}\int_{\epsilon}^{1-\epsilon}(1-x)^{-\gamma}x^{-\delta}dx.

Here we write ϵ=1/(t+1)\epsilon=1/(t+1). If the exponents γ,δ\gamma,\delta of x,1xx,1-x in the integrand are smaller than 1, the integral converges to the beta function, B(1γ,1δ)B(1-\gamma,1-\delta), in the limit ϵ0\epsilon\to 0. If γ1\gamma\geq 1 or δ1\delta\geq 1, it is necessary to regularize the integral. We denote the integral as B(1γ,1δ,ϵ)B(1-\gamma,1-\delta,\epsilon),

B(1γ,1δ,ϵ)ϵ1ϵ(1x)γxδ𝑑x.B(1-\gamma,1-\delta,\epsilon)\equiv\int_{\epsilon}^{1-\epsilon}(1-x)^{-\gamma}x^{-\delta}dx.

We write the finite term of B(1γ,1δ,ϵ)B(1-\gamma,1-\delta,\epsilon) in the limit ϵ\epsilon as Breg(1γ,1δ)B_{reg}(1-\gamma,1-\delta).

B(1γ,1δ,ϵ)=Breg(1γ,1δ)+divergent terms.B(1-\gamma,1-\delta,\epsilon)=B_{reg}(1-\gamma,1-\delta)+\mbox{divergent terms}.

When γ<1,δ<1\gamma<1,\delta<1, the integral converges and Breg(1γ,1δ)B_{reg}(1-\gamma,1-\delta) is B(1γ,1δ)B(1-\gamma,1-\delta).

The regularization procedure is simple. When γ>1\gamma>1 and δ<1\delta<1, the integral of (1x)γxδ(1-x)^{-\gamma}x^{-\delta} diverges when the upper bound 1ϵ1-\epsilon goes to 1. We regularize the integral as

B(1γ,1δ,ϵ)\displaystyle B(1-\gamma,1-\delta,\epsilon) =\displaystyle= 1/ϵ1ϵ(1x)γ(xδ1)𝑑x+1/ϵ1ϵ(1x)γ𝑑x\displaystyle\int_{1/\epsilon}^{1-\epsilon}(1-x)^{-\gamma}(x^{-\delta}-1)dx+\int_{1/\epsilon}^{1-\epsilon}(1-x)^{-\gamma}dx
=\displaystyle= Breg(1γ,1δ,ϵ)+1γ1ϵ1γ.\displaystyle B_{reg}(1-\gamma,1-\delta,\epsilon)+\frac{1}{\gamma-1}\epsilon^{1-\gamma}.

Here we neglect a term that vanishes in the limit ϵ0\epsilon\to 0. The second term diverges as ϵ1γtγ1\epsilon^{1-\gamma}\propto t^{\gamma-1}. Likewise for γ>1,δ>1\gamma>1,\delta>1, the divergence of the integral of (1x)γxδ(1-x)^{-\gamma}x^{-\delta} comes from both bounds of the integral. We regularize the integral as

B(1γ,1δ,ϵ)\displaystyle B(1-\gamma,1-\delta,\epsilon) =\displaystyle= 1/ϵ1ϵ(((1x)γ1)(xdelta1)1)𝑑x+1/ϵ1ϵ(1x)γ𝑑x+1/ϵ1ϵxδ𝑑x\displaystyle\int_{1/\epsilon}^{1-\epsilon}(((1-x)^{-\gamma}-1)(x^{-delta}-1)-1)dx+\int_{1/\epsilon}^{1-\epsilon}(1-x)^{-\gamma}dx+\int_{1/\epsilon}^{1-\epsilon}x^{-\delta}dx
=\displaystyle= Breg(1γ,1δ,ϵ)+1γ1ϵ1γ+1δ1ϵ1δ.\displaystyle B_{reg}(1-\gamma,1-\delta,\epsilon)+\frac{1}{\gamma-1}\epsilon^{1-\gamma}+\frac{1}{\delta-1}\epsilon^{1-\delta}.

When γ=1\gamma=1 and δ<1\delta<1, a logarithmic divergence appears, and we have

B(0,1δ,ϵ)\displaystyle B(0,1-\delta,\epsilon) =\displaystyle= 1/ϵ1ϵ((1x)1)(xdelta1)𝑑x+1/ϵ1ϵ(1x)1𝑑x\displaystyle\int_{1/\epsilon}^{1-\epsilon}((1-x)^{-1})(x^{-delta}-1)dx+\int_{1/\epsilon}^{1-\epsilon}(1-x)^{-1}dx
=\displaystyle= Breg(0,1δ,ϵ)lnϵ.\displaystyle B_{reg}(0,1-\delta,\epsilon)-\ln\epsilon.

Using B(1γ,1δ),B(1γ,1δ)B(1-\gamma,1-\delta),B(1-\gamma,1-\delta^{\prime}), (14) becomes

(Dγ(t)+θ)(ctδ+ctδ)=tγ+t1γδB(1γ,1δ,ϵ)+t1γδB(1γ,1δ,ϵ).(D_{\gamma}(t)+\theta)(ct^{-\delta}+c^{\prime}t^{-\delta^{\prime}})=t^{-\gamma}+t^{1-\gamma-\delta}B(1-\gamma,1-\delta,\epsilon)+t^{1-\gamma-\delta^{\prime}}B(1-\gamma,1-\delta^{\prime},\epsilon). (15)
  1. 1.

    γ<1\gamma<1 case.

    Dγ(t)=t1γ/(1γ)D_{\gamma}(t)=t^{1-\gamma}/(1-\gamma) and we assume δ<1\delta<1. (15) becomes,

    (θ+t1γ/(1γ))(ctδ+ctδ)\displaystyle(\theta+t^{1-\gamma}/(1-\gamma))(ct^{-\delta}+c^{\prime}t^{-\delta^{\prime}}) =\displaystyle= tγ+ct1γδB(1γ,1δ)\displaystyle t^{-\gamma}+ct^{1-\gamma-\delta}B(1-\gamma,1-\delta)
    +\displaystyle+ ct1γδ(Breg(1γ,1δ)+1δ1tδ1).\displaystyle c^{\prime}t^{1-\gamma-\delta^{\prime}}(B_{reg}(1-\gamma,1-\delta^{\prime})+\frac{1}{\delta^{\prime}-1}t^{\delta^{\prime}-1}).

    The last term 1δ1tδ1\frac{1}{\delta^{\prime}-1}t^{\delta^{\prime}-1} does not appear when δ<1\delta^{\prime}<1. The leading term of the left-hand side is c/(1γ)t1γδc/(1-\gamma)t^{1-\gamma-\delta}. The leading term of the right-hand side is cB(1γ,1δ)t1γδcB(1-\gamma,1-\delta)t^{1-\gamma-\delta} as 1γδ>1γδ1-\gamma-\delta>1-\gamma-\delta^{\prime} and 1γδ>γ1-\gamma-\delta>-\gamma. By equating the coefficients of the leading terms, we obtain an identity

    11γ=B(1γ,1δ).\frac{1}{1-\gamma}=B(1-\gamma,1-\delta).

    As B(1γ,1)=1/(1γ)B(1-\gamma,1)=1/(1-\gamma), we obtain the result δ=0\delta=0. The subleading term of the left-hand side is O(t0)O(t^{0}) as δ=0\delta=0, we have 1γδ=01-\gamma-\delta^{\prime}=0. We obtain δ=1γ\delta^{\prime}=1-\gamma. If one assume δ>1\delta>1, we cannot match O(t0)O(t^{0}) terms of both sides of (15). We can exclude the possibility δ,δ>1\delta,\delta^{\prime}>1 for γ<1\gamma<1.

  2. 2.

    γ=1\gamma=1 case.

    Dγ(t)=lntD_{\gamma}(t)=\ln t and we assume δ<1\delta<1. (15) becomes,

    (θ+lnt)(ctδ+ctδ)=t1+ctδ(Breg(0,1δ)+lnt)+ctδ(Breg(0,1δ)+lnt+1δ1tδ1).(\theta+\ln t)(ct^{-\delta}+c^{\prime}t^{-\delta^{\prime}})=t^{-1}+ct^{-\delta}(B_{reg}(0,1-\delta)+\ln t)+c^{\prime}t^{-\delta^{\prime}}(B_{reg}(0,1-\delta^{\prime})+\ln t+\frac{1}{\delta^{\prime}-1}t^{\delta^{\prime}-1}).

    The leading term is O(lnttδ)O(\ln t\cdot t^{-\delta}) and we obtain a trivial identity c=cc=c. The subleading term is O(tδ)O(t^{-\delta}) and we obtain the identity

    θ=Breg(0,1δ).\theta=B_{reg}(0,1-\delta). (16)

    By solving the identity, we estimate δc\delta_{c}, the critical exponent of the ACF, C(t)tδcC(t)\propto t^{-\delta_{c}}. The estimation of δ\delta^{\prime} is obscure, as t The same identity with δ\delta holds for δ\delta^{\prime} by comparing the O(tδ)O(t^{-\delta^{\prime}}) terms. It suggests that δ=δ\delta^{\prime}=\delta at γ=1\gamma=1.

  3. 3.

    γ>1\gamma>1 case.

    Dγ(t)=1/(γ1)D_{\gamma}(t)=1/(\gamma-1) and we assume 1<δ<δ1<\delta<\delta^{\prime}. (15) becomes,

    (θ+1/(γ1))(ctδ+ctδ)\displaystyle(\theta+1/(\gamma-1))(ct^{-\delta}+c^{\prime}t^{-\delta^{\prime}}) =\displaystyle= t1+ct1γδ(Breg(1γ,1δ)+1γ1tγ1+1δ1tδ1)\displaystyle t^{-1}+ct^{1-\gamma-\delta}(B_{reg}(1-\gamma,1-\delta)+\frac{1}{\gamma-1}t^{\gamma-1}+\frac{1}{\delta-1}t^{\delta-1})
    +\displaystyle+ ct1γδ(Breg(1γ,1δ)+1γ1tγ1+1δ1tδ1).\displaystyle c^{\prime}t^{1-\gamma-\delta^{\prime}}(B_{reg}(1-\gamma,1-\delta^{\prime})+\frac{1}{\gamma-1}t^{\gamma-1}+\frac{1}{\delta^{\prime}-1}t^{\delta^{\prime}-1}).

    The leading term is O(tδ)O(t^{-\delta}) and we obtain δ=γ\delta=\gamma. The subleading term is O(tδ)O(t^{-\delta^{\prime}}) and we obtain δ=2γ1\delta^{\prime}=2\gamma-1.

The power-law exponents of the leading and subleading terms of the ACF C(t)C(t) are summarized as

(δ,δ)={(γ,2γ1)γ>1(δc,δc)γ=1(0,1γ)γ<1\displaystyle(\delta,\delta^{\prime})=\left\{\begin{array}[]{cc}(\gamma,2\gamma-1)&\gamma>1\\ (\delta_{c},\delta_{c})&\gamma=1\\ (0,1-\gamma)&\gamma<1\end{array}\right. (20)

We denote the values by (δth,δth)(\delta_{th},\delta_{th}^{\prime}) to distinguish them from the numerically estimated values, which are denoted as (δnum,δnum)(\delta_{num},\delta_{num}^{\prime}).

We solve the recursive relation of (3) numerically and estimate C(t)C(t) for t28×104t\leq 2^{8}\times 10^{4}. We set (α,β)=(1,1)(\alpha,\beta)=(1,1) and γ[0.0,0.1,0,2,,2.0]\gamma\in[0.0,0.1,0,2,\cdots,2.0]. For large t>>1t>>1, the leading term of C(t)C(t) is ctδct^{-\delta}, and we have the relation for δ\delta,

δ=limtln2C(t)/C(2t).\delta=\lim_{t\to\infty}\ln_{2}C(t)/C(2t). (21)

We estimate δnum\delta_{num} by the formula with t=27×104t=2^{7}\times 10^{4}. About δnum\delta_{num}^{\prime}, we use the relation

δ=limtln2C(t)C(2t)2δC(2t)C(4t)2δ.\delta^{\prime}=\lim_{t\to\infty}\ln_{2}\frac{C(t)-C(2t)2^{\delta}}{C(2t)-C(4t)2^{\delta}}. (22)

The relation is based on the relation for the subleading term of C(t)C(t), C(t)C(2t)2δctδ(12δδ)C(t)-C(2t)2^{\delta}\simeq c^{\prime}t^{-\delta^{\prime}}(1-2^{\delta-\delta^{\prime}}) for t>>1t>>1. We estimate δnum\delta_{num}^{\prime} using the formula t=26×104t=2^{6}\times 10^{4}.

Refer to caption
Figure 1: Plot of (δnum,δnum)(\delta_{num},\delta_{num}^{\prime}) vs. γ\gamma. We estimate the power-law exponents using (21) and (22). (δth,δth)(\delta_{th},\delta_{th}^{\prime}) are the theoretically predicted values in (20).

Figure 1 shows (δnum,δnum)(\delta_{num},\delta_{num}^{\prime}) vs. γ\gamma. For reference, we also plot (δth,δth)(\delta_{th},\delta_{th}^{\prime}). As can be clearly observed, if γ\gamma is far from the critical value γc=1\gamma_{c}=1, the numerically estimated values agree with the theoretical results. However, near the critical point, the discrepancy becomes large. At the critical point, both δnum,δnum\delta_{num},\delta_{num}^{\prime} coincide with δc\delta_{c}.

V Conclusion

We study the Pólya urn with an arbitrary memory kernel. Previously, we have shown that the process shows a phase transition for the power-law decay memory kernelHisakado and Mori (2020). We formulate the problem using the method of stochastic approximation. It is necessary to introduce an auxiliary field {Z(r,t),0<r<}\{Z(r,t),0<r<\infty\} to reduce the process Markovian. The SDE for the auxiliary field are coupled as the field is driven by a common Wiener process. We obtain the formal solution for the initial value problem and derive the self-consistent equation for the ACF in the framework. We estimate the power-law exponents for the leading and subleading terms of the ACF. The results for the power-law exponents are new except for δc\delta_{c} at the critical point.

The essential difference from the continuous phase transitions of the nonlinear Pólya urn is the lack of the universal function. The transition of nonlinear Pólya urns has the correlation length scale ξ\xi for the decay of the ACF and the asymptotic behavior of the ACF obeys a scaling relation. For the existence of the scaling relation for ACF, the continuity of the power-law exponent and the logarithmic behavior of the ACF at the critical point is crucialMori and Hisakado (2015a); Nakayama and Mori (2021). In the phase transition of the Pólya urn with power-law decay, the power-law exponent is discontinuous at γ=1\gamma=1 and the ACF does not show logarithmic behavior at the critical point. The phase transition of the Pólya urn with power-law memory kernel is completely different from that of the nonlinear Pólya urn. The nature of the former is the phase transition of the stationary-nonstationary phase transition. The nature of the latter is the absorption-state phase transition where there exists the scaling relation for the order parameter.

This study also shows the importance of the correlation of the Wiener process. The auxiliary field {Z(r,t),0<r<}\{Z(r,t),0<r<\infty\}, which is introduced for the Pólya urn with power-law decay memory kernel, shares the same Wiener process WtW_{t} and it induces the coupling among the modes of the field. If {Z(r,t),0<r<}\{Z(r,t),0<r<\infty\} are driven by independent Wiener processes {W(r,t),0<r<}\{W(r,t),0<r<\infty\}, they obey the Ornstein-Uhlenbeck process for the exponential decay memory kernel independently from each other. They are mean-reverting and converge to the same equilibrium α/θ\alpha/\theta. Even though Z(r,t)Z(r,t) with small rr should have long relaxation time, Zγ(t)=<Z(r,t)>Z_{\gamma}(t)=<Z(r,t)> also converges to the same equilibrium. The phase transition does not occur in the case. The condition of the mean-revering for the multivariate Ornstein-Uhlenbeck process is knownVatiwutipong and Phewchean (2019). By the correlation of the Wiener process, the condition should be broken. To understand the importance of the correlation of the Wiener process for the asymptotic behavior of the process, we need to solve the initial value problem for the SDE of the coupled auxiliary field.

References

  • Mantegna and Stanley (2007) R. N. Mantegna and H. E. Stanley, Introduction to Econophysics: Correlations and Complexity in Finance (Cambridge University Press, Cambridge, 2007).
  • Galam (2008) S. Galam, Int. J. Mod. Phys. C 19, 409 (2008).
  • Castellano et al. (2009) C. Castellano, S. Fortunato,  and V. Loreto, Rev.Mod.Phys. 81, 591 (2009).
  • Arthur (1989) W. B. Arthur, Econ. Jour. 99, 116 (1989).
  • Bikhchandani et al. (1992) S. Bikhchandani, D. Hirshleifer,  and I. Welch, J. Polit. Econ. 100, 992 (1992).
  • Cont and Bouchaud (2000) R. Cont and J. Bouchaud, Macroecon. Dynam. 4, 170 (2000).
  • Mori et al. (2016) S. Mori, K. Nakayama,  and M. Hisakado, Phys.Rev. E 94, 052301 (2016).
  • Kirman (1993) A. Kirman, Q. J. Econ. 108, 137 (1993).
  • Lux (1995) T. Lux, Econ. J. 105, 881 (1995).
  • Hisakado and Mori (2020) M. Hisakado and S. Mori, Physica A 544, 123480 (2020).
  • Pluto and Tasche (2011) K. Pluto and D. Tasche, in The Basel II Risk Parameters, edited by B.Engelmann. and R. Rauhmeier (Springer, Berlin, 2011) pp. 75–101.
  • Hisakado and Mori (2021) M. Hisakado and S. Mori, Physica A 563, 125435 (2021).
  • G.Pólya (1931) G.Pólya, Ann. Inst. Henri Poincaré 1, 117 (1931).
  • Kanazawa and D.Sornette (2020a) K. Kanazawa and D.Sornette, Phys. Rev. Lett. 125, 138301 (2020a).
  • Kanazawa and D.Sornette (2020b) K. Kanazawa and D.Sornette, Phys. Rev. Research. 2, 033442 (2020b).
  • Hawkes (1971) A. Hawkes, J. R. Stat. Soc. Ser. B 33, 438 (1971).
  • Mori and Hisakado (2015a) S. Mori and M. Hisakado, Phys.Rev. E 92, 052112 (2015a).
  • Nakayama and Mori (2021) K. Nakayama and S. Mori, Phys. Rev.E 104, 014109 (2021).
  • Hinrichsen (2000) H. Hinrichsen, Adv.Phys. 49, 815 (2000).
  • Mori and Hisakado (2015b) S. Mori and M. Hisakado, J.Phys.Soc.Jpn. 84, 054001 (2015b).
  • Bo¨\ddot{o}hm (2000) W. Bo¨\ddot{o}hm, J. Appl. Prob. 37, 470 (2000).
  • Gardiner (2009) C. Gardiner, Stochastic Methods: A handbook for the Natural and Social Science, 4th ed. (Springer, Berlin, 2009).
  • Renlund (2010) H. Renlund, “Generalized pólya urns via stochastic approximation,”  (2010), arXiv:1002.3716 [math.PR] .
  • Vatiwutipong and Phewchean (2019) P. Vatiwutipong and N. Phewchean, Adv.Differ.Equ. , 276 (2019).

Appendix A SDE analysis of Pólya urn

We study the Pólya urn using the SDE for the reader’s convenience. We denote Dd(t),Sd(t),Zd(t)D_{d}(t),S_{d}(t),Z_{d}(t) for the infinite memory kernel as D(t)=tD_{\infty}(t)=t,S(t)=s=1tX(s)S_{\infty}(t)=\sum_{s=1}^{t}X(s),Z(t)=s=1tX(s)/tZ_{\infty}(t)=\sum_{s=1}^{t}X(s)/t, respectively. To estimate the expected value and the variance of Z(t+1)Z_{\infty}(t+1) with the condition Z(t)=zZ_{\infty}(t)=z, we decompose Z(t+1)Z_{\infty}(t+1) as

Z(t+1)=Z(t)+1t+1(X(t)Z(t)).Z_{\infty}(t+1)=Z_{\infty}(t)+\frac{1}{t+1}(X(t)-Z_{\infty}(t)).

We estimate the conditional expected value and conditional variance of Z(t+1)Z_{\infty}(t+1) as follows:

E(Z(t+1)|Z(t)=z)z\displaystyle E(Z_{\infty}(t+1)|Z_{\infty}(t)=z)-z =\displaystyle= αθz(D(t)+θ)D(t+1),\displaystyle\frac{\alpha-\theta z}{(D_{\infty}(t)+\theta)D_{\infty}(t+1)},
V(Z(t+1)|Z(t)=z)\displaystyle V(Z_{\infty}(t+1)|Z_{\infty}(t)=z) =\displaystyle= 1D(t+1)2α+tzθ+tβ+t(1z)θ+t.\displaystyle\frac{1}{D_{\infty}(t+1)^{2}}\cdot\frac{\alpha+tz}{\theta+t}\cdot\frac{\beta+t(1-z)}{\theta+t}.

We read the drift A(t)A_{\infty}(t) and variance B(t)B_{\infty}(t) from them and derive the SDE Gardiner (2009) as

dZ=A(αθZ)dt+BdWt.dZ_{\infty}=A_{\infty}(\alpha-\theta Z_{\infty})dt+B_{\infty}dW_{t}.

A(t),B(t)A_{\infty}(t),B_{\infty}(t) are defined as follows:

A(t)=1D(t+1)(θ+D(t)),B(t)=BD(t+1),B=(α+tz)(β+t(1z))(θ+t)2.A_{\infty}(t)=\frac{1}{D_{\infty}(t+1)(\theta+D_{\infty}(t))},B_{\infty}(t)=\frac{B}{D_{\infty}(t+1)},B=\sqrt{\frac{(\alpha+tz)(\beta+t(1-z))}{(\theta+t)^{2}}}.

As E(Z(t)|X(1)=x)=(α+x)/(θ+1)E(Z_{\infty}(t)|X(1)=x)=(\alpha+x)/(\theta+1) for the Pólya urn, we take the limit tt\to\infty by replacing zz with (α+x)/(θ+1)(\alpha+x)/(\theta+1) and approximate BB as

B=(α+x)(β+(1x))(θ+1)2.B=\sqrt{\frac{(\alpha+x)(\beta+(1-x))}{(\theta+1)^{2}}}.

The SDE is the Ornstein–Uhlenbeck type SDEGardiner (2009). The solution with the initial condition Z(1)=X(1)=xZ_{\infty}(1)=X(1)=x is

Z(t)=αθ(1eθ1tA(s)𝑑s)+xeθ1tA(s)𝑑s+eθ1tA(s)𝑑s1tB(s)eθ1sA(u)𝑑u𝑑Ws.Z_{\infty}(t)=\frac{\alpha}{\theta}(1-e^{-\theta\int_{1}^{t}A_{\infty}(s)ds})+x\cdot e^{-\theta\int_{1}^{t}A_{\infty}(s)ds}+e^{-\theta\int_{1}^{t}A_{\infty}(s)ds}\int_{1}^{t}B_{\infty}(s)e^{\theta\int_{1}^{s}A_{\infty}(u)du}dW_{s}.

We have 1tA(s)𝑑s=1θlnt(1+θ)t+θ\int_{1}^{t}A_{\infty}(s)ds=\frac{1}{\theta}\ln\frac{t(1+\theta)}{t+\theta}, and the solution is written as follows:

Z(t)=αθ(1t+θt(1+θ))+xt+θt(1+θ)+(1+θt)1tB1+sss+θ𝑑Ws.Z_{\infty}(t)=\frac{\alpha}{\theta}\left(1-\frac{t+\theta}{t(1+\theta)}\right)+x\cdot\frac{t+\theta}{t(1+\theta)}+\left(1+\frac{\theta}{t}\right)\int_{1}^{t}\frac{B}{1+s}\cdot\frac{s}{s+\theta}dW_{s}.

The conditional expected value of Z(t)Z_{\infty}(t) for X(1)=xX(1)=x is

E(Z(t)|X(1)=x)=αθ(1t+θt(1+θ))+xt+θt(1+θ)=t>>1αθθ1+θ+x11+θ.E(Z_{\infty}(t)|X(1)=x)=\frac{\alpha}{\theta}(1-\frac{t+\theta}{t(1+\theta)})+x\cdot\frac{t+\theta}{t(1+\theta)}\stackrel{{\scriptstyle t>>1}}{{=}}\frac{\alpha}{\theta}\cdot\frac{\theta}{1+\theta}+x\cdot\frac{1}{1+\theta}.

The ACF is then estimated as

C(t)\displaystyle C(t) =\displaystyle= E(X(t+1)|X(1)=1)E(X(t+1)|X(1)=0)\displaystyle E(X(t+1)|X(1)=1)-E(X(t+1)|X(1)=0)
=\displaystyle= E(P(X(t+1)=1|X(1)=1)E(P(X(t+1)=1)|X(1)=0)\displaystyle E(P(X(t+1)=1|X(1)=1)-E(P(X(t+1)=1)|X(1)=0)
=\displaystyle= D(t)(E(Z(t)|X(1)=1)E(Z(t)|X(1)=0))D(t)+θ=1θ+1=C(1).\displaystyle\frac{D_{\infty}(t)(E(Z_{\infty}(t)|X(1)=1)-E(Z_{\infty}(t)|X(1)=0))}{D_{\infty}(t)+\theta}=\frac{1}{\theta+1}=C(1).

The result agrees with (4). We estimate the conditional variance of Z(t)Z_{\infty}(t) for X(1)=xX(1)=x as

V(Z(t)|X(1)=x)=B2(1+θt)2t1(t+θ)(1+θ)B21+θ=1θ+1(α+x)(β+(1x)(θ+1)2.V(Z_{\infty}(t)|X(1)=x)=B^{2}\left(1+\frac{\theta}{t}\right)^{2}\cdot\frac{t-1}{(t+\theta)(1+\theta)}\simeq\frac{B^{2}}{1+\theta}=\frac{1}{\theta+1}\cdot\frac{(\alpha+x)(\beta+(1-x)}{(\theta+1)^{2}}.

Apart from the denominator of the prefactor, it agrees with the exact result of (5).