
Generalized Bernoulli Process and Fractional Binomial Distribution

Jeonghwa Lee, Department of Mathematics and Statistics, University of North Carolina Wilmington, USA
Abstract

Recently, a generalized Bernoulli process (GBP) was developed as a stationary binary sequence whose covariance function obeys a power law. In this paper, we further develop generalized Bernoulli processes, reveal their asymptotic behaviors, and find applications. We show that a GBP can have the same scaling limit as the fractional Poisson process. Considering that the Poisson process approximates the Bernoulli process under certain conditions, the connection we found between the GBP and the fractional Poisson process can be thought of as its counterpart under long-range dependence. When applied to indicator data, a GBP outperforms a Markov chain in the presence of long-range dependence. Fractional binomial models are defined as partial sums of GBPs, and it is shown that when applied to count data with excess zeros, a fractional binomial model outperforms the zero-inflated models that are used extensively for such data in the current literature.

Keywords: Generalized Bernoulli process, Long-range dependence, Fractional Poisson process, Count data with excess zeros

1 Introduction

The binomial and Poisson models are used to model counts of events. Because both rest on the key assumptions of memorylessness and a constant event rate, the Poisson distribution can approximate the binomial distribution. In practice, these assumptions often fail, resulting in overdispersion or excess zeros in count data. Numerous models have been developed for such data, among them zero-inflated models and generalized linear-type models; see Skellam (1948); Altham (1978); Kadane (2016); Wedderburn (1974); Consul (1989); Lambert (1992); Conway and Maxwell (1962).

Some models were developed through a correlated structure between events. In Rodrigues et al. (2013), the Markov-correlated Poisson process and the Markov-dependent binomial distribution were developed through Markov-dependent Bernoulli trials. In Borges et al. (2012), a new class of correlated Poisson processes and correlated weighted Poisson processes was proposed based on equicorrelated uniform random variables.

In Lee (2021b), a generalized Bernoulli process (GBP) was defined as a stationary binary sequence whose covariance function decreases slowly, following a power law. This property is called long-range dependence (LRD), or the long-memory property, and a stochastic process with LRD shows different large-scale behavior than the "usual" stationary process whose covariance function decays exponentially fast. LRD has been observed in many fields such as hydrology, econometrics, and earth science, and there have been several approaches to defining LRD; for more information, see Samorodnitsky (2018). The estimation method for the parameters of a GBP and its application to earthquake data can be found in Lee (2021a).

In 2003, Laskin introduced the fractional Poisson process through the fractional Kolmogorov-Feller equation that governs its non-Markovian dependence structure. In the fractional Poisson process, events no longer occur independently, and the interarrival time follows the first family of Mittag-Leffler distributions, which is heavy-tailed (Laskin, 2003). The long-memory property of the fractional Poisson process was investigated in Biard and Saussereau (2014).

In this paper, we further develop generalized Bernoulli processes, investigate their asymptotic properties, and find applications. It turns out that there is an interesting connection between a GBP and the fractional Poisson process. Both can have the same scaling limit, the second family of Mittag-Leffler distributions, and in both the GBP and the fractional Poisson process, the interarrival time follows a heavy-tailed distribution. Therefore, the large-scale behavior of the GBP is similar to that of the fractional Poisson process, and the GBP can be considered a discrete-time counterpart of the fractional Poisson process.

In the application of GBPs to indicator data in economics, it is shown that a GBP outperforms a Markov-dependent Bernoulli process when LRD is present in the data. Fractional binomial models are defined as the sum of the first $n$ variables in the GBPs, and it is shown that when applied to count data with over-dispersion/excess zeros, a fractional binomial model outperforms zero-inflated models.

In Section 2, we will review the GBP proposed in Lee (2021b) and define a new generalized Bernoulli process. In Section 3, we will compare the GBPs and the fractional Poisson process by examining their asymptotic properties. In Section 4, we will continue to compare these distributions through simulations. Section 5 shows the applications of the GBPs and fractional binomial models to real data. All the proofs and technical results can be found in the Supplementary material. Throughout this paper, we assume that $i, i_0, i_1, i_2, \cdots \in \mathbb{N}$, $i', i_0', i_1', i_2', \cdots \in \mathbb{N}$, $i_0 < i_1 < i_2 < \cdots$, and $i_0' < i_1' < i_2' < \cdots$, unless mentioned otherwise. For any set $A$, $|A|$ is the number of elements in $A$, with $|\emptyset|=0$. $a_n \sim b_n$ means $a_n/b_n \to 1$ as $n\to\infty$. For notational convenience, we denote $P((\cap_i A_i)\cap(\cap_j B_j))$ by $P(\cap_i A_i \cap_j B_j)$.

2 Generalized Bernoulli processes and fractional binomial distributions

2.1 Generalized Bernoulli process I and fractional binomial distribution I

We first review the generalized Bernoulli process $\{X_i, i\in\mathbb{N}\}$ developed in Lee (2021b). We will call it the generalized Bernoulli process I (GBP-I) to differentiate it from what we will develop in this paper. The GBP-I, $\{X_i, i\in\mathbb{N}\}$, was defined with parameters $p, H, c$ that satisfy the following assumption.

Assumption 1.

$p, H \in (0,1)$, and

$0 \leq c < \min\left\{1-p,\ \tfrac{1}{2}\left(-2p+2^{2H-2}+\sqrt{4p-p\,2^{2H}+2^{4H-4}}\right)\right\}.$

Under Assumption 1, the GBP-I is well defined with the following probabilities.

$P(X_i=1)=p, \quad P(X_i=0)=1-p,$
$P(X_{i_0}=1, X_{i_1}=1, \cdots, X_{i_n}=1) = p\prod_{j=1}^{n}\left(p+c|i_j-i_{j-1}|^{2H-2}\right),$

and for any disjoint sets $A,B\subset\mathbb{N}$,

$P(\cap_{i'\in B}\{X_{i'}=0\}\cap_{i\in A}\{X_i=1\}) = \sum_{k=0}^{|B|}\sum_{B'\subset B,\,|B'|=k}(-1)^{k}P(\cap_{i\in B'\cup A}\{X_i=1\}), \qquad (1)$

and

$P(\cap_{i'\in B}\{X_{i'}=0\}) = 1+\sum_{k=1}^{|B|}\sum_{B'\subset B,\,|B'|=k}(-1)^{k}P(\cap_{i\in B'}\{X_i=1\}). \qquad (2)$

The following operators were defined in Lee (2021b) to express the above probabilities more conveniently.

Definition 1.

Define the following operation on a set $A=\{i_0,i_1,\cdots,i_n\}\subset\mathbb{N}\cup\{0\}$ with $i_0<i_1<\cdots<i_n$:

$L_H(A)=\prod_{j=1}^{n}\left(p+c|i_j-i_{j-1}|^{2H-2}\right).$

If $A=\emptyset$, define $L_H(A):=1/p$, and if $|A|=1$, $L_H(A):=1$.

Definition 2.

Define, for disjoint sets $A,B\subset\mathbb{N}\cup\{0\}$ with $|B|=m>0$,

$D_H(A,B)=\sum_{j=0}^{|B|}\sum_{B'\subset B,\,|B'|=j}(-1)^{j}L_H(A\cup B').$

If $B=\emptyset$, $D_H(A,B):=L_H(A)$.

Using the operators, the probabilities in the GBP-I can be expressed as

$P(\cap_{i\in A}\{X_i=1\}\cap_{i'\in B}\{X_{i'}=0\}) = p\,D_H(A,B)$

for disjoint sets $A,B\subset\mathbb{N}$. The GBP-I is a stationary process with covariance function

$Cov(X_i,X_j)=pc|i-j|^{2H-2}, \quad i\neq j.$

When $H\in(.5,1)$, the GBP-I possesses LRD since $\sum_{i=1}^{\infty}Cov(X_1,X_i)=\infty$. The sum of the first $n$ variables in the GBP-I was defined as the fractional binomial random variable, whose mean is $np$, and if $H\in(.5,1)$, its variance is asymptotically proportional to $n^{2H}$. We will call it the fractional binomial distribution I and denote it by $B_n(p,H,c)$, or simply $B_n$. When $c=0$, $B_n(p,H,0)$ becomes the regular binomial random variable with parameters $n$ and $p$.
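As a concrete illustration of Definitions 1-2 and the joint probabilities above, the following is a minimal Python sketch (ours, not the authors' code) of the operators $L_H$ and $D_H$ and the joint probability $p\,D_H(A,B)$. The function names are our own, and the parameter values are arbitrary choices satisfying Assumption 1.

```python
from itertools import combinations

# A minimal sketch (not the authors' code) of the GBP-I operators L_H and
# D_H from Definitions 1-2, and the joint probability p * D_H(A, B).
# Parameter values below are arbitrary choices satisfying Assumption 1.

def L(A, p, H, c):
    """L_H(A): product of (p + c|i_j - i_{j-1}|^(2H-2)) over sorted A."""
    A = sorted(A)
    if len(A) == 0:
        return 1.0 / p          # convention: L_H(empty set) = 1/p
    prod = 1.0
    for a, b in zip(A, A[1:]):  # empty product when |A| = 1
        prod *= p + c * abs(b - a) ** (2 * H - 2)
    return prod

def D(A, B, p, H, c):
    """D_H(A, B): inclusion-exclusion over subsets of B (Definition 2)."""
    return sum((-1) ** j * L(set(A) | set(Bp), p, H, c)
               for j in range(len(B) + 1)
               for Bp in combinations(B, j))

def joint_prob(ones, zeros, p, H, c):
    """P(X_i = 1 for i in `ones`, X_i = 0 for i in `zeros`) = p * D_H."""
    return p * D(ones, zeros, p, H, c)

p, H, c = 0.3, 0.7, 0.2  # arbitrary values satisfying Assumption 1
# Consistency check: marginalizing X_2 out recovers P(X_1 = 1) = p.
total = joint_prob({1, 2}, set(), p, H, c) + joint_prob({1}, {2}, p, H, c)
print(abs(total - p) < 1e-12)  # True
```

The final check verifies that the inclusion-exclusion construction is internally consistent: the probabilities of $\{X_2=1\}$ and $\{X_2=0\}$ given $X_1=1$ add back to the marginal $P(X_1=1)=p$.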

2.2 Generalized Bernoulli process II and fractional binomial distribution II

We define the generalized Bernoulli process II (GBP-II), $\{X_i^{(n)}, i=1,2,\cdots,n\}$, $n\in\mathbb{N}$, which has three parameters $H, c, \lambda$, and the fractional binomial random variable II, $B_n^{\circ}(H,c,\lambda)=\sum_{i=1}^{n}X_i^{(n)}$. To ease our notation, $X_i$ and $X_i^{(n)}$ are used interchangeably, and $B_n^{\circ}$ will replace $B_n^{\circ}(H,c,\lambda)$ when there is no confusion. For the GBP-II, it is assumed that the parameters satisfy the following condition.

Assumption 2.

$H\in(.5,1)$, $c\in(0,2^{2H-2})$, and $\lambda\in(0,c)$.

We will show that the GBP-II is well defined under Assumption 2 with the following probabilities. For $n\in\mathbb{N}$, $\{X_i^{(n)}, i=1,2,\cdots,n\}$ is defined with $P(X_i=1)=p_n$, $P(X_i=0)=1-p_n$ for $i=1,2,\cdots,n$, and for any $1\leq i_0<i_1<i_2<\cdots<i_k\leq n$,

$P(X_{i_0}=1, X_{i_1}=1, \cdots, X_{i_k}=1) = p_n c^{k}|(i_1-i_0)(i_2-i_1)\cdots(i_k-i_{k-1})|^{2H-2}, \qquad (3)$

where $p_n=\lambda n^{2H-2}$.

Definition 3.

Define the following operation on a set $A=\{i_0,i_1,\cdots,i_k\}\subset\mathbb{N}\cup\{0\}$, $i_0<i_1<\cdots<i_k$:

$L_H^{\circ}(A)=\prod_{j=1}^{k}|i_j-i_{j-1}|^{2H-2}.$

If $|A|=1$, then we define $L_H^{\circ}(A):=1$. If $|A|=0$ (that is, $A=\emptyset$), then $L_H^{\circ}(A):=c/p_n$. For example, if $A=\{1,2,4,7\}$, then $L_H^{\circ}(A)=|(2-1)(4-2)(7-4)|^{2H-2}$.

Definition 4.

Define, for disjoint sets $A,B\subset\mathbb{N}\cup\{0\}$ with $B\neq\emptyset$,

$D_H^{\circ}(A,B)=\sum_{j=0}^{|B|}\sum_{B'\subset B,\,|B'|=j}(-1)^{j}c^{j}L_H^{\circ}(A\cup B').$

If $B=\emptyset$, then $D_H^{\circ}(A,B):=L_H^{\circ}(A)$.

The joint probabilities in the GBP-II are defined with (3) and (1)-(2) for any disjoint sets $A,B\subset\{1,2,\cdots,n\}$, i.e., by the inclusion-exclusion principle. This can be succinctly written as

$P(\cap_{i\in A}\{X_i=1\}\cap_{i'\in B}\{X_{i'}=0\}) = p_n c^{|A|-1}D_H^{\circ}(A,B). \qquad (4)$

To show that the GBP-II is well defined, we have to verify that $(4)\geq 0$ for any disjoint sets $A,B\subset\{1,2,\cdots,n\}$, for any $n\in\mathbb{N}$.

Proposition 2.1.

Under Assumption 2, for any $n\in\mathbb{N}$ and any disjoint sets $A,B\subset\{1,2,\cdots,n\}$,

$p_n c^{|A|-1}D_H^{\circ}(A,B)>0.$

By Proposition 2.1, the GBP-II is a well-defined stationary binary sequence whose correlation function asymptotically obeys a power law.

Theorem 2.2.

For any $n\in\mathbb{N}$, $\{X_i^{(n)}, i=1,2,\cdots,n\}$ defined with (4) under Assumption 2 is a stationary process with $P(X_i=1)=p_n$, $P(X_i=0)=1-p_n$, and
i)

$Cov(X_i^{(n)},X_j^{(n)})=p_n c|i-j|^{2H-2}-p_n^2,$

for $i\neq j$, $i,j=1,2,\cdots,n$.
ii) For any $i,j\in\mathbb{N}$,

$\lim_{n\to\infty}Corr(X_i^{(n)},X_j^{(n)})=c|i-j|^{2H-2}.$

iii) Define $B_n^{\circ}=\sum_{i=1}^{n}X_i$; then $E(B_n^{\circ})=\lambda n^{2H-1}$, and as $n\to\infty$,

$E((B_n^{\circ})^2) \sim \dfrac{\lambda c}{(2H-1)H}\,n^{4H-2}, \qquad Var(B_n^{\circ}) \sim n^{4H-2}\left(\dfrac{\lambda c}{(2H-1)H}-\lambda^2\right).$

We call the stationary process $\{X_i^{(n)}, i=1,2,\cdots,n\}$ defined in Theorem 2.2 the generalized Bernoulli process II (GBP-II). Also, the sum of the first $n$ variables in the GBP-II, $B_n^{\circ}$, is called the fractional binomial random variable II.

Remark 1.

If we use a correlation function and its asymptotic behavior to define long-range dependence (LRD), then we conclude that the GBP-II always possesses a long-memory property, since

$\sum_{i=1}^{n}corr(X_1^{(n)},X_i^{(n)}) \sim \left(c/(2H-1)-\lambda\right)n^{2H-1},$

therefore,

$\lim_{n\to\infty}\sum_{i=1}^{n}corr(X_1^{(n)},X_i^{(n)})=\infty.$

Alternatively, one can define LRD in the GBP-II using the asymptotic behavior of its covariance function. Since $\sum_{i=1}^{n}Cov(X_1^{(n)},X_i^{(n)}) \sim (c\lambda/(2H-1)-\lambda^2)n^{4H-3}$, if $H\in(3/4,1)$, then

$\lim_{n\to\infty}\sum_{i=1}^{n}Cov(X_1^{(n)},X_i^{(n)})=\infty,$

and the GBP-II is considered to have LRD when $H\in(3/4,1)$.
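The construction above can be checked numerically. The sketch below (ours, not the authors' code) implements the GBP-II joint probabilities (4), with the all-zero pattern handled through the inclusion-exclusion formula (2), and verifies by brute force that the probabilities of all $2^n$ outcomes sum to 1. The parameter values are arbitrary choices satisfying Assumption 2.

```python
from itertools import combinations

# Sketch (not the authors' code) of the GBP-II joint probabilities (4),
# built from the operators of Definitions 3-4, with a brute-force check
# that the probabilities of all 2^n outcomes sum to 1.  Parameter values
# are arbitrary choices satisfying Assumption 2.

def L_circ(A):
    """L_H^o(A): product of |i_j - i_{j-1}|^(2H-2) over sorted A."""
    A = sorted(A)
    prod = 1.0
    for a, b in zip(A, A[1:]):  # empty product when |A| <= 1
        prod *= abs(b - a) ** (2 * H - 2)
    return prod

def D_circ(A, B):
    """D_H^o(A, B): signed sum over subsets of B (Definition 4)."""
    return sum((-1) ** j * c ** j * L_circ(set(A) | set(Bp))
               for j in range(len(B) + 1)
               for Bp in combinations(B, j))

def joint_prob(ones, zeros, n):
    """P(X_i=1, i in `ones`; X_i=0, i in `zeros`) for GBP-II of length n."""
    p_n = lam * n ** (2 * H - 2)
    if not ones:  # all-zero pattern, via inclusion-exclusion (2)
        total = 1.0
        for j in range(1, len(zeros) + 1):
            for Bp in combinations(zeros, j):
                total += (-1) ** j * p_n * c ** (j - 1) * L_circ(Bp)
        return total
    return p_n * c ** (len(ones) - 1) * D_circ(ones, zeros)

H, c, lam = 0.8, 0.5, 0.3   # satisfy Assumption 2: c < 2^(2H-2), lam < c
n = 4
idx = set(range(1, n + 1))
total = sum(joint_prob(set(ones), idx - set(ones), n)
            for k in range(n + 1) for ones in combinations(idx, k))
print(round(total, 10))  # 1.0
```

Summing to 1 is automatic from the inclusion-exclusion construction; what Proposition 2.1 adds is that each individual probability is nonnegative.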

3 GBP and fractional Poisson process

3.1 Comparison between the GBP-II and the fractional Poisson process

We will show the connection between the GBP-II and the fractional Poisson process by using the moment generating function (mgf). First, we modify the GBP-II and define $\{X_i^{*}, i=1,2,\cdots\}$ such that

$P(X_i^{*}=1)=ci^{2H-2}=cL_H^{\circ}(\{0,i\}),$
$P(X_{i_1}^{*}=1, X_{i_2}^{*}=1, \cdots, X_{i_k}^{*}=1) = c^{k}|i_1(i_2-i_1)(i_3-i_2)\cdots(i_k-i_{k-1})|^{2H-2} = c^{k}L_H^{\circ}(\{0,i_1,\cdots,i_k\}),$

and in general, for $A,B\subset\mathbb{N}$, $A\cap B=\emptyset$,

$P(\cap_{i\in A}\{X_i^{*}=1\}\cap_{i'\in B}\{X_{i'}^{*}=0\}) = c^{|A|}D_H^{\circ}(A\cup\{0\},B). \qquad (5)$

Note that $D_H^{\circ}(A\cup\{0\},B)=D_H^{\circ}(A^{(1)}\cup\{1\},B^{(1)})$, where $A^{(j)}=\{i+j : i\in A\}$ and $B^{(j)}=\{i+j : i\in B\}$. Therefore, by Proposition 2.1, $(5)>0$, and $\{X_i^{*}, i=1,2,\cdots\}$ is well defined; we will call it the GBP-II$^{*}$.

Roughly speaking, $\{X_i^{*}, i=1,2,\cdots\}$ can be considered as what is observed after the first 1 in the GBP-II for large $n$. In fact, the probability distribution (5) is the limiting distribution of the conditional probability of the GBP-II observed after the first "1". If we define a random variable $T_0$ as the time when the first "1" appears in the GBP-II, then for any $t,i\in\mathbb{N}$,

$\lim_{n\to\infty}P(X_{i+t}^{(n)}=1 \mid T_0=t)=ci^{2H-2},$

and for $A,B\subset\mathbb{N}$, $A\cap B=\emptyset$,

$\lim_{n\to\infty}P(\cap_{i\in A^{(t)}}\{X_i^{(n)}=1\}\cap_{i'\in B^{(t)}}\{X_{i'}^{(n)}=0\} \mid T_0=t) = c^{|A|}D_H^{\circ}(A^{(t)}\cup\{t\},B^{(t)}) = c^{|A|}D_H^{\circ}(A\cup\{0\},B),$

which is the same as the distribution of $\{X_i^{*}, i\in\mathbb{N}\}$ defined in (5).

Define $B_n^{\circ,*}(H,c)$ as the sum of the first $n$ variables in the GBP-II$^{*}$,

$B_n^{\circ,*}=\sum_{i=1}^{n}X_i^{*},$

and call it the fractional binomial random variable II$^{*}$. In both of the fractional binomial distributions II and II$^{*}$, the moments are asymptotically proportional to powers of $n^{2H-1}$.

Theorem 3.1.

For any $k\in\mathbb{N}$, as $n\to\infty$,
i) the fractional binomial II has

$E((B_n^{\circ})^k) \sim c_k\, n^{(2H-1)k},$

where $c_k=\dfrac{k!\,\lambda(c\Gamma(2H-1))^{k-1}}{\Gamma((2H-1)(k-1)+2)}$;
ii) the fractional binomial II$^{*}$ has

$E((B_n^{\circ,*})^k) \sim c_k^{*}\, n^{(2H-1)k},$

where $c_k^{*}=\dfrac{k!\,(c\Gamma(2H-1))^{k}}{\Gamma((2H-1)k+1)}$.

It turns out that a scaled fractional binomial II$^{*}$ has the same limiting distribution as a scaled fractional Poisson distribution. More specifically, the scaled fractional binomial II$^{*}$ and the scaled fractional Poisson, $B_n^{\circ,*}(H,c)/n^{2H-1}$ and $N_{2H-1,c\Gamma(2H-1)}(n)/n^{2H-1}$, converge in distribution to the second family of Mittag-Leffler random variable of order $2H-1$ as $n\to\infty$.

Theorem 3.2.

For the fractional binomial distribution II$^{*}$ with parameters $H,c$ that satisfy Assumption 2, and the fractional Poisson process with parameters $\mu=2H-1$, $\nu=c\Gamma(2H-1)$, the following holds: both $B_n^{\circ,*}/n^{\mu}$ and $N_{\mu,\nu}(n)/n^{\mu}$ converge in distribution, as $n\to\infty$, to the second family of Mittag-Leffler random variable of order $\mu$, $X_\mu$, whose mgf is the Mittag-Leffler function, i.e.,

$E(e^{tX_\mu})=E_\mu(\nu t) \text{ for } t\in\mathbb{R},$

where

$E_\mu(z)=\sum_{k=0}^{\infty}\dfrac{z^k}{\Gamma(\mu k+1)}.$
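For reference, the Mittag-Leffler function above can be evaluated numerically by truncating its power series, which is adequate for moderate $|z|$. The sketch below is our own illustration (the helper name and tolerances are our choices), checked against the classical identities $E_1(z)=e^z$ and $E_{1/2}(z)=e^{z^2}\mathrm{erfc}(-z)$.

```python
import math

# Numerical sketch (ours) of the Mittag-Leffler function E_mu(z) of
# Theorem 3.2, evaluated by truncating its power series.

def mittag_leffler(mu, z, tol=1e-15, max_terms=150):
    """E_mu(z) = sum_{k>=0} z^k / Gamma(mu*k + 1), truncated when the
    current term falls below `tol` (adequate for moderate |z|)."""
    total = 0.0
    for k in range(max_terms):
        term = z ** k / math.gamma(mu * k + 1)
        total += term
        if k > 0 and abs(term) < tol:
            break
    return total

# Sanity checks against known special cases of the Mittag-Leffler family.
print(abs(mittag_leffler(1.0, 2.0) - math.exp(2.0)) < 1e-9)               # True
print(abs(mittag_leffler(0.5, 1.0) - math.exp(1.0) * math.erfc(-1.0)) < 1e-9)  # True
```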

3.2 Comparison between the GBP-I and the GBP-II

We will compare the asymptotic properties of the GBPs through their asymptotic moments and the tail behavior of their return times.

First, we define the GBP-I$^{*}$ in a similar way that we defined the GBP-II$^{*}$. The GBP-I$^{*}$, $\{X_i^{*}, i=1,2,\cdots\}$, is what is observed after the first "1" in the GBP-I. Then it has

$P(X_i^{*}=1)=P(X_{i+t}=1 \mid T_0=t)=p+ci^{2H-2},$
$P(\cap_{i\in A}\{X_i^{*}=1\}\cap_{i'\in B}\{X_{i'}^{*}=0\}) = P(\cap_{i\in A}\{X_{i+t}=1\}\cap_{i'\in B}\{X_{i'+t}=0\} \mid T_0=t) = D_H(A\cup\{0\},B),$

for $A,B\subset\mathbb{N}$, $A\cap B=\emptyset$, where $T_0$ is the first time when "1" appeared in the GBP-I. We also define $B_n^{*}(p,H,c)$ as the sum of the first $n$ variables in the GBP-I$^{*}$, and call it the fractional binomial random variable I$^{*}$.

Theorem 3.3.

For the fractional binomial I, $B_n$, and the fractional binomial I$^{*}$, $B_n^{*}$, we have the following asymptotic properties as $n\to\infty$.
i) For the fractional binomial I, the central moments are

$E((B_n-np)^2) \sim \begin{cases} b_2^{(1)}\,n & \text{if } H\in(0,.5),\\ b_2^{(2)}\,n\ln n & \text{if } H=.5,\\ b_2^{(3)}\,n^{2H} & \text{if } H\in(.5,1),\end{cases}$
$E((B_n-np)^k) \sim b_k\, n^{2H-2+k} \text{ for } k=3,4,\cdots, \text{ and } H\in(0,1),$

where $b_2^{(1)}=p(1-p)+2pc/(1-2H)$, $b_2^{(2)}=2pc$, $b_2^{(3)}=pc/(H(2H-1))$, $b_k=k(k-1)cp(-p)^{k-2}/((2H-2+k)(2H-3+k))$, and $\lfloor k/2\rfloor$ is the largest integer smaller than or equal to $k/2$.
ii) For the fractional binomial I$^{*}$,

$E((B_n^{*}-np)^2) \sim \begin{cases} b_2^{(1)}\,n & \text{if } H\in(0,.5),\\ b_2^{(2)}\,n\ln n & \text{if } H=.5,\\ b_2^{(3,*)}\,n^{2H} & \text{if } H\in(.5,1),\end{cases}$
$E((B_n^{*}-np)^k) \sim b_k^{*}\, n^{2H-2+k} \text{ for } k=3,4,\cdots, \text{ and } H\in(0,1),$

where $b_2^{(3,*)}=pc/(H(2H-1))-pc/H$ and $b_k^{*}=kc(-p)^{k-1}/(2H-2+k)+k(k-1)cp(-p)^{k-2}/((2H-2+k)(2H-3+k))$.

For the non-central moments, $E(B_n)=np$, $E(B_n^{*})\sim np$, and $E(B_n^k)\sim n^k p^k$, $E((B_n^{*})^k)\sim n^k p^k$ for $k\geq 2$.

Let's consider the number of 0's between two successive 1's, plus one, as a return time (also called an interarrival time), and denote the return time from the $(i-1)^{th}$ 1 to the $i^{th}$ 1 in the GBP-I as $T_i$. In the GBP-I, the return times $\{T_i, i=2,3,\cdots\}$ are i.i.d. with

$P(T_i=k) = \begin{cases} D_H(\{1,k+1\},\{2,3,\cdots,k\}) & \text{for } k\in\mathbb{N}\setminus\{1\},\\ L_H(\{1,2\}) & \text{for } k=1.\end{cases}$

We will drop the subscript and use TT as a random variable for return time in the GBP-I.

In the GBP-II $\{X_i^{(n)}\}$, we define a return time in the same way and denote by $T_i^{(n)}$ the return time to the $i^{th}$ 1. Note that $T_i^{(n)}$, $i=2,3,\cdots$, are independent but not identically distributed, since the length of the sequence $\{X_i^{(n)}, i=1,\cdots,n\}$ is fixed at $n$. However, they are asymptotically i.i.d. as $n\to\infty$, with limiting distribution

$\lim_{n\to\infty}P(T_i^{(n)}=k) = \begin{cases} cD_H^{\circ}(\{1,k+1\},\{2,3,\cdots,k\}) & \text{for } k\in\mathbb{N}\setminus\{1\},\\ cL_H^{\circ}(\{1,2\}) & \text{for } k=1,\end{cases} \qquad (6)$

which is in fact the distribution of the return time $T^{\circ}$ in the GBP-II$^{*}$, if we define the return time in the GBP-II$^{*}$, $\{X_i^{*}, i\in\mathbb{N}\}$, in the same way. In the GBP-II$^{*}$, denoting by $T_i^{\circ}$ the return time to the $i^{th}$ 1, it is easily derived that the return times $T_1^{\circ}, T_2^{\circ}, \cdots$ are i.i.d. with distribution (6).

It is found that the return time of the GBP-I has a finite mean, whereas the return time of the GBP-II has an infinite mean. Also, the return time follows a heavy-tailed distribution in both the GBP-I and the GBP-II, with

$P(\text{return time}>t) \sim ct^{-\alpha}$

for large $t$ and some constant $c$, where $\alpha\in(1,3)$ for the GBP-I and $\alpha\in(0,1)$ for the GBP-II.
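To make the return-time law concrete, the following sketch (ours, not the authors' code) evaluates the GBP-I pmf $P(T=k)$ directly from the operators. Since the inclusion-exclusion sum has $2^{k-1}$ terms, only small $k$ are computed; the parameter values are arbitrary choices satisfying Assumption 1.

```python
from itertools import combinations

# Sketch (not the authors' code) of the GBP-I return-time pmf P(T = k),
# computed directly from the operators L_H and D_H.  The inclusion-
# exclusion sum has 2^(k-1) terms, so only small k are evaluated here.
# Parameter values are arbitrary choices satisfying Assumption 1.

p, H, c = 0.4, 0.3, 0.2

def L(A):
    """L_H(A) for |A| >= 1 (the empty-set case is not needed here)."""
    A = sorted(A)
    prod = 1.0
    for a, b in zip(A, A[1:]):
        prod *= p + c * abs(b - a) ** (2 * H - 2)
    return prod

def D(A, B):
    """D_H(A, B): inclusion-exclusion over subsets of B."""
    return sum((-1) ** j * L(set(A) | set(Bp))
               for j in range(len(B) + 1)
               for Bp in combinations(B, j))

def pmf(k):
    """P(T = k): a 1 at time 1, zeros at times 2..k, a 1 at time k+1."""
    if k == 1:
        return L({1, 2})          # = p + c
    return D({1, k + 1}, set(range(2, k + 1)))

probs = [pmf(k) for k in range(1, 16)]
print(all(q > 0 for q in probs), round(sum(probs), 4))
```

The partial sums stay below 1 and approach it slowly, consistent with the heavy tail $P(T>t)=t^{2H-3}L_1(t)$ of Theorem 3.4 below.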

Theorem 3.4.

In the GBP-I, $E(T)=1/p$. If $H\in(0,.5)$, $var(T)<\infty$, and if $H\in[.5,1)$, $var(T)=\infty$. Furthermore,

$P(T>t)=t^{2H-3}L_1(t)$

for $H\in(0,1)$, where $L_1$ is a slowly varying function that depends on the parameters $H,p,c$ of the GBP-I.

Theorem 3.5.

In the GBP-II, $E(T^{\circ})=\infty$, and

$P(T^{\circ}>t)=t^{1-2H}L_2(t)$

for $H\in(.5,1)$, where $L_2$ is a slowly varying function that depends on the parameters $H,\lambda,c$ of the GBP-II.

4 Simulation

In this section, we examine and compare the shapes of the fractional binomial distributions (FB) and the fractional Poisson distribution through simulations. Each histogram in Figures 1-4 was made from 3000 simulated random variables of the corresponding distribution. Figure 1 shows the histograms of the scaled FB-II$^{*}$, $B_n^{\circ,*}/n^{2H-1}$, and the scaled fractional Poisson random variable, $N_{\mu,\nu}(n)/n^{2H-1}$, for various parameters $H, c, n$, with $\mu=2H-1$, $\nu=c\Gamma(2H-1)$. It is observed that when $n=50$, the histograms of the scaled FB-II$^{*}$ and the scaled fractional Poisson random variable are fairly close to the pdf of the second family of Mittag-Leffler distribution. The approximation is better for larger $n$. This result reflects Theorem 3.2 well.

Figure 2 shows the histograms of the FB-II, II$^{*}$, and the fractional Poisson distribution for various parameters. It is seen that the histograms of the FB-II$^{*}$ and the fractional Poisson largely overlap, which is not surprising given that their scaled distributions have the same limiting distribution. However, the FB-II behaves quite differently from the other two distributions, as it has a high peak near 0.

The FB-I, I$^{*}$, and the binomial distribution are compared in Figure 3. The binomial distribution is roughly symmetric and bell-shaped for each set of parameters, as expected from the central limit theorem, whereas the FBs show various shapes and larger variability than the binomial distribution. Unlike the FB-I$^{*}$, the FB-I has a large probability near 0, a phenomenon similar to that observed in Figure 2 for the FB-II.

Figure 4 combines the results of Figures 2 and 3, putting together the histograms of the binomial, FB-I, I$^{*}$, II, II$^{*}$, and fractional Poisson distributions for each set of parameters. The FB-II$^{*}$ and the fractional Poisson distributions are similar to each other, and the shape and range of these distributions are close to neither the binomial nor the FB-I.

Figure 1: The first row: $(H,c)=(.6,.2)$ with $n=50$ (left), 1000 (right);
the second row: $(H,c)=(.8,.6)$ with $n=50$ (left), 1000 (right);
the third row: $(H,c)=(.9,.3)$ with $n=50$ (left), 1000 (right).
Figure 2: The first row: $(H,c)=(.6,.2)$ with $n=50$ (left), 1000 (right);
the second row: $(H,c)=(.8,.6)$ with $n=50$ (left), 1000 (right);
the third row: $(H,c)=(.9,.3)$ with $n=50$ (left), 1000 (right). ($\lambda=c/2$ for all graphs.)
Figure 3: The first row: $(p,H,c)=(.1,.6,.2)$ with $n=50$ (left), 1000 (right);
the second row: $(p,H,c)=(.1,.8,.6)$ with $n=50$ (left), 1000 (right);
the third row: $(p,H,c)=(.6,.9,.3)$ with $n=50$ (left), 1000 (right).
Figure 4: The first row: $(p,H,c)=(.1,.6,.2)$ with $n=50$ (left), 1000 (right);
the second row: $(p,H,c)=(.1,.8,.6)$ with $n=50$ (left), 1000 (right);
the third row: $(p,H,c)=(.6,.9,.3)$ with $n=50$ (left), 1000 (right).

5 Applications

5.1 Application of GBP

The monthly unemployment rate (seasonally adjusted, percent) from January 1948 to December 2022 was obtained from the U.S. Bureau of Labor Statistics (https://fred.stlouisfed.org/series/UNRATE) and is shown in the left graph of Figure 5. Using the decomposition method, the trend of the unemployment-rate time series was extracted, and the remaining, detrended component was used for data analysis. From the detrended time series, a sequence of indicator variables was made to specify whether the detrended unemployment rate is above a cutoff (indicator 1) or not (indicator 0) at each time. We tried three cutoffs, .08%, .21%, and .28%, yielding three indicator sequences. In Figure 5, the graph on the right shows the detrended unemployment rates and horizontal lines at the three cutoffs.

Figure 5: Left: monthly unemployment rate (percent). Right: detrended unemployment rate with horizontal lines at .08%, .21%, .28%.

For each indicator sequence, we checked its autocorrelation in the left graphs of Figure 6. It is observed that in the indicator sequence with cutoff .08, the correlation decreases relatively fast, and as the cutoff increases, the decay rate of the correlation slows down.

Figure 6: Left: Autocorrelograms of the indicator sequence of detrended unemployment rate with cutoff .08% (top), .21% (middle), .28% (bottom).
Right: Autocorrelograms of return times in the indicator sequence with cutoff .08% (top), .21% (middle), .28% (bottom).

The GBP-I, GBP-II, Bernoulli process, and Markov chains were applied to each indicator sequence. In each of the four stochastic models, the return times, times between successive 1’s, are independent and identically distributed as

$P(\text{return time}=k) = \begin{cases} L_H(\{1,2\})I_{\{k=1\}} + I_{\{k>1\}}D_H(\{1,k+1\},\{2,3,\cdots,k\}) & \text{in the GBP-I,}\\ cL_H^{\circ}(\{1,2\})I_{\{k=1\}} + I_{\{k>1\}}cD_H^{\circ}(\{1,k+1\},\{2,3,\cdots,k\}) & \text{in the GBP-II,}\\ (1-p)^{k-1}p & \text{in the Bernoulli process,}\\ pI_{\{k=1\}} + I_{\{k>1\}}(1-p)(1-q)^{k-2}q & \text{in the Markov chain,}\end{cases} \qquad (7)$

where $k\in\mathbb{N}$, $p,q\in(0,1)$, and $I_{\{k=1\}}, I_{\{k>1\}}$ are indicator variables.

In each indicator sequence, we checked whether the return times were correlated. Autocorrelograms on the right in Figure 6 show that at a 5% significance level, there is no significant evidence that return times are correlated. Therefore, we could proceed to fit each of the distributions in (7) to the return times of each indicator sequence.
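As a hedged sketch of the fitting step (the data and the exact fitting code are not part of this paper), the two simpler return-time models in (7), the Bernoulli process (geometric) and the Markov chain, admit closed-form maximum-likelihood estimates, computed below on a synthetic sample. The GBP-I and GBP-II likelihoods would instead evaluate the pmfs in (7) through $D_H$ and $D_H^{\circ}$ and maximize numerically.

```python
import math

# Hedged sketch (ours) of the model fitting in this section: closed-form
# maximum-likelihood fits of the Bernoulli-process (geometric) and
# Markov-chain return-time pmfs in (7), compared by AIC.  The sample
# below is synthetic, not the unemployment indicator data.

def fit_geometric(times):
    """Bernoulli process: P(T=k) = (1-p)^(k-1) p; MLE p = 1/mean."""
    p = 1.0 / (sum(times) / len(times))
    loglik = sum(math.log((1 - p) ** (k - 1) * p) for k in times)
    return p, 2 * 1 - 2 * loglik        # (MLE, AIC with 1 parameter)

def fit_markov(times):
    """Markov chain: P(T=1)=p, P(T=k)=(1-p)(1-q)^(k-2) q for k>1.
    MLEs: p = fraction of 1's; q = m / sum(k-1) over the k>1 tail."""
    n1 = sum(1 for k in times if k == 1)
    tail = [k for k in times if k > 1]
    p = n1 / len(times)
    q = len(tail) / sum(k - 1 for k in tail)
    loglik = n1 * math.log(p) + sum(
        math.log((1 - p) * (1 - q) ** (k - 2) * q) for k in tail)
    return (p, q), 2 * 2 - 2 * loglik   # (MLEs, AIC with 2 parameters)

times = [1, 1, 2, 5, 1, 9, 2, 1, 14, 3, 1, 7, 2, 1, 4]  # synthetic sample
(geo_p, geo_aic), (mc_est, mc_aic) = fit_geometric(times), fit_markov(times)
print(round(geo_p, 3), round(mc_est[0], 3), round(mc_est[1], 3))
```

The Markov chain fits the probability of an immediate repeat (k = 1) separately from the tail, which is why it can beat the geometric model when short and long return times mix, as in Tables 1-3.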

The results on the MLE of the parameters and the AIC of each model are shown in Tables 1-3. From Table 1, when the cutoff is .08, the Markov chain shows the best fit, having the smallest AIC, followed by the GBP-I. Table 2 shows that with the cutoff of .21, GBP-I has the smallest AIC, although the difference from the second smallest AIC from the Markov chain is very small. In Table 3, when the cutoff is .28, the GBP-I best fits the data, followed by the GBP-II and the Markov chain, in that order. The graphs in Figure 7 show the histogram of the return time overlaid with the fitted probability models.

Probability model Parameters MLE AIC
GBP-I ($p,H,c$) (.30, .11, .23) 974.68
GBP-II ($H,c$) (.85, .56) 1027.04
Bernoulli process $p$ .31 1079.06
Markov chain ($p,q$) (.57, .19) 961.55
Table 1: Indicator sequence with the cutoff .08
Probability model Parameters MLE AIC
GBP-I ($p,H,c$) (.09, .46, .33) 479.76
GBP-II ($H,c$) (.77, .45) 496.18
Bernoulli process $p$ .10 570.96
Markov chain ($p,q$) (.47, .06) 480.38
Table 2: Indicator sequence with the cutoff .21
Probability model Parameters MLE AIC
GBP-I ($p,H,c$) (.06, .58, .42) 340.56
GBP-II ($H,c$) (.75, .53) 346
Bernoulli process $p$ .07 458.44
Markov chain ($p,q$) (.53, .04) 348.41
Table 3: Indicator sequence with the cutoff .28
Figure 7: Data and fitted distributions of the return time in the indicator sequence with cutoff .08 (top left), .21 (top right), .28 (bottom)

From the results, it is observed that the relative performance of the GBP-I against the Markov chain improves as the cutoff increases. This seems to be related to the fact that as the cutoff increases, the estimated value of $H$ in the GBP-I also increases. In particular, with the largest cutoff, the estimate of $H$ was .58, which indicates the presence of long-range dependence (LRD) in the indicator sequence. Since GBPs, unlike a Markov chain, can incorporate LRD in a binary sequence, it is not surprising that a GBP shows a better fit than a Markov chain in the presence of LRD.

5.2 Application of fractional binomial distribution

We use a dataset in horticulture from Ridout et al. (1998). The dataset contains the number of roots produced by 270 micropropagated shoots of the columnar apple cultivar Trajans, cultured under different experimental conditions; the distribution of the number of roots is over-dispersed and has excess zeros. In Ridout et al. (1998), it was shown that zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models fitted the data better than Poisson and negative binomial models with covariates on the experimental conditions. Here we use only the number of roots, without covariates, and fit fractional binomial models, ZIP, and ZINB. The MLEs of the parameters and the AIC of each model are given in Table 4. The first FB-I shows the lowest AIC, followed by the first FB-II, the zero-inflated models, and the remaining FB-II and FB-I, in that order. Figure 8 shows the fitted distribution of each model together with the data distribution. The first FB-I and FB-II fit the data well over the entire range of the distribution, whereas the zero-inflated models overestimate the variable around the middle of the range and underestimate it in the latter half. The last FB-I and FB-II show the worst fit to the data distribution; moreover, since the estimated p in that FB-I is almost zero, its fitted distribution becomes almost identical to that of the last FB-II.

Probability model Parameters MLE AIC
FB-I (p, H, c) (.30, .74, .31) 1348.44
FB-II (λ, H, c) (.47, .92, .57) 1350.1
ZINB (λ, r, π) (6.59, 9.97, .23) 1358.6
ZIP (λ, π) (6.62, .24) 1381.6
FB-II (H, c) (.77, .70) 1420.24
FB-I (p, H, c) (.00, .77, .70) 1422.24
Table 4: MLE and AIC for the roots dataset
Figure 8: Fitted and data distribution on the number of roots
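As a point of reference for the zero-inflated fits in Table 4, the ZIP model sets P(Y=0) = π + (1−π)e^{−λ} and P(Y=k) = (1−π)·Poisson(k; λ) for k > 0, and its parameters are obtained by maximizing the likelihood numerically. The following is a minimal sketch of such a fit on synthetic data (not the roots dataset); the sample size, true parameter values, and starting point are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

# Synthetic ZIP sample: structural zeros with prob. pi, else Poisson(lam).
rng = np.random.default_rng(0)
n, true_pi, true_lam = 500, 0.25, 6.5
structural_zero = rng.random(n) < true_pi
y = np.where(structural_zero, 0, rng.poisson(true_lam, n))

def neg_loglik(theta):
    lam, pi = theta
    # P(Y=0) = pi + (1-pi)e^{-lam};  P(Y=k) = (1-pi)*Poisson(k; lam), k > 0
    ll_zero = np.log(pi + (1 - pi) * np.exp(-lam))
    ll_pos = np.log(1 - pi) - lam + y * np.log(lam) - gammaln(y + 1)
    return -np.sum(np.where(y == 0, ll_zero, ll_pos))

res = minimize(neg_loglik, x0=[1.0, 0.5],
               bounds=[(1e-6, None), (1e-6, 1 - 1e-6)])
lam_hat, pi_hat = res.x
aic = 2 * 2 + 2 * res.fun  # two parameters: (lam, pi)
print(lam_hat, pi_hat, aic)
```

With a moderate sample the estimates land near the generating values; the same likelihood-plus-AIC routine, with the appropriate probability mass function swapped in, underlies the fractional binomial and ZINB comparisons reported above.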

6 Conclusion

We proposed generalized Bernoulli processes, which are stationary binary sequences, and found a connection to the fractional Poisson process. GBPs can possess long-range dependence, and their interarrival times follow a heavy-tailed distribution. Since a GBP can have the same scaling limit as the fractional Poisson process, it can be considered a discrete-time analog of the fractional Poisson process. Fractional binomial distributions are defined as the sum in GBPs and take various shapes, from highly skewed to flat.

GBPs were applied to economic data with indicator variables, and their model fit was compared with that of a Markov chain. In the presence of LRD, the GBPs outperformed the Markov chain, which can be explained by the fact that GBPs can incorporate LRD. LRD appeared in the dataset when a higher cutoff was used for the indicator sequence, which suggests a connection between rare events and LRD and points to the potential applicability of GBPs for modeling LRD in rare events.

Fractional binomial models were applied to count data with excess zeros. It was shown that a fractional binomial model fitted the data better than zero-inflated models that are extensively used for overdispersed, excess zero count data.

Supplementary Material

Generalized Bernoulli process and fractional Poisson process: supplemental document. All the proofs of the proposition and theorems of this article are included in the supplementary material.

Computer code for simulations and data applications. Code for simulations can be found in J. Lee, frbinom, (2023), GitHub repository, https://github.com/leejeo25/frbinom. Computer code used in the application of Section 5 is provided in J. Lee, GBP_FB, (2023), GitHub repository, https://github.com/leejeo25/GBP_FB.

References

  • Altham, P. M. E. (1978). Two generalizations of the binomial distribution. Journal of the Royal Statistical Society. Series C (Applied Statistics) 27(2), 162–167.
  • Biard, R. and B. Saussereau (2014). Fractional Poisson process: long-range dependence and applications in ruin theory. Journal of Applied Probability 51(3), 727–740.
  • Borges, P., J. Rodrigues, and N. Balakrishnan (2012). A class of correlated weighted Poisson processes. Journal of Statistical Planning and Inference 142(1), 366–375.
  • Consul, P. C. (1989). Generalized Poisson Distributions: Properties and Applications. Dekker.
  • Conway, R. W. and W. L. Maxwell (1962). A queuing model with state dependent service rates. Journal of Industrial Engineering 12, 132–136.
  • Kadane, J. B. (2016). Sums of possibly associated Bernoulli variables: the Conway–Maxwell-binomial distribution. Bayesian Analysis 11(2).
  • Lambert, D. (1992). Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34(1), 1–14.
  • Laskin, N. (2003). Fractional Poisson process. Communications in Nonlinear Science and Numerical Simulation 8(3), 201–213. Chaotic transport and complexity in classical and quantum dynamics.
  • Lee, J. (2021a). Generalized Bernoulli process: simulation, estimation, and application. Dependence Modeling 9(1), 141–155.
  • Lee, J. (2021b). Generalized Bernoulli process with long-range dependence and fractional binomial distribution. Dependence Modeling 9(1), 1–12.
  • Ridout, M. S., C. G. B. Demétrio, and J. P. Hinde (1998). Models for count data with many zeros.
  • Rodrigues, J., N. Balakrishnan, and P. Borges (2013). Markov-correlated Poisson processes. Communications in Statistics - Theory and Methods 42(20), 3696–3703.
  • Samorodnitsky, G. (2018). Stochastic Processes and Long Range Dependence. Springer International Publishing.
  • Skellam, J. G. (1948). A probability distribution derived from the binomial distribution by regarding the probability of success as variable between the sets of trials. Journal of the Royal Statistical Society: Series B (Methodological) 10(2), 257–261.
  • Wedderburn, R. W. M. (1974). Quasi-likelihood functions, generalized linear models, and the Gauss–Newton method. Biometrika 61(3), 439–447.