
Moments of the multivariate Beta distribution

Feng Zhao
Abstract

In this paper, we extend the Beta distribution to $2\times 2$ matrices and give analytical formulas for its moments. These formulas can be used to analyze the asymptotic behavior of the Beta distribution for $2\times 2$ matrices.

keywords:
multivariate Beta distribution, higher moments

1 Introduction

Moments of a probability distribution are an important topic in statistics. Given its moment sequence, a probability distribution is unique under some mild conditions. Hence, to prove the convergence of a sequence of random variables, we can instead prove the convergence of their moment sequences. To accomplish this, an analytical form of the moments is a prerequisite, and the techniques used to compute moments differ from distribution to distribution. In this article, we focus on the Beta distribution for $2\times 2$ matrices.

Dawid introduces a multivariate extension of the Beta distribution, denoted by $\mathbf{B}(\alpha,\beta;I_{p})$ (see [1]). It is a random $p\times p$ symmetric matrix $W$ whose density function is given by

\[
p(w)=\frac{1}{B_{p}(\alpha,\beta)}\,\lvert w\rvert^{\alpha-\frac{p+1}{2}}\,\lvert I-w\rvert^{\beta-\frac{p+1}{2}}\qquad\textrm{where }w,\ I-w\in S_{p,p}^{++}
\tag{1}
\]
\[
B_{p}(\alpha,\beta)=\int_{w,\,I-w\in S_{p,p}^{++}}\lvert w\rvert^{\alpha-\frac{p+1}{2}}\,\lvert I-w\rvert^{\beta-\frac{p+1}{2}}\,dw\qquad\textrm{where }\alpha,\beta>\frac{p-1}{2}
\tag{2}
\]

$B_{p}(\alpha,\beta)$ is called the multivariate Beta function (see [5]); $\lvert W\rvert$ denotes the determinant of the matrix $W$ and $S_{p,p}^{++}$ is the set of $p\times p$ positive definite matrices. When $p=1$, the distribution reduces to the ordinary Beta distribution on $0<x<1$.
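As a quick numerical companion to this definition, the following Python sketch draws samples from $\mathbf{B}(\alpha,\beta;I_{2})$. It relies on the standard Wishart-quotient construction $W=(A+B)^{-1/2}A(A+B)^{-1/2}$ with independent $A\sim W_{2}(2\alpha,I)$ and $B\sim W_{2}(2\beta,I)$, which is not discussed in this paper and is assumed here to match density (1); the helper name `sample_beta2` and the test values are ours.

```python
# Monte Carlo sketch (not part of the paper): sample W ~ B(alpha, beta; I_2)
# via the assumed Wishart-quotient construction
#   W = (A + B)^{-1/2} A (A + B)^{-1/2},  A ~ W_2(2*alpha, I),  B ~ W_2(2*beta, I).
import numpy as np
from scipy.stats import wishart


def inv_sqrt_spd(M):
    """Inverse symmetric square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(M)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T


def sample_beta2(alpha, beta, n_samples, seed=0):
    """Draw n_samples matrices from B(alpha, beta; I_2); alpha, beta > 1/2 assumed."""
    rng = np.random.default_rng(seed)
    A = wishart.rvs(df=2 * alpha, scale=np.eye(2), size=n_samples, random_state=rng)
    B = wishart.rvs(df=2 * beta, scale=np.eye(2), size=n_samples, random_state=rng)
    W = np.empty_like(A)
    for i in range(n_samples):
        R = inv_sqrt_spd(A[i] + B[i])
        W[i] = R @ A[i] @ R
    return W  # shape (n_samples, 2, 2)


if __name__ == "__main__":
    W = sample_beta2(2.0, 3.0, 100_000)
    # sanity check: E[X] should be close to alpha / (alpha + beta) = 0.4 (cf. Theorem 1)
    print(W[:, 0, 0].mean())
```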

This extension may have useful applications in multivariate statistical problems, but little is known about its analytical properties. For example, it is unknown whether the moments $\mathbb{E}[f(W)]$ can be written in a concise form when $f$ is a monomial in the entries of the positive definite matrix $W$.

Konno has derived formulas for the moments up to second order (see [4]). In this paper, we focus on the case $p=2$ and deduce an analytical form for the moments of $\mathbf{B}(\alpha,\beta;I_{2})$. This includes the expectation and variance, which come from the first- and second-order moments respectively. As far as we know, our moment formula is novel, and it can be used directly in computations involving multivariate Beta models in place of approximate numerical integration.

In this article, the following notational conventions are adopted: $W=\begin{pmatrix}X&Z\\ Z&Y\end{pmatrix}$ is the symmetric random matrix under consideration. Its density function is given by Equation (1), which can also be treated as the joint density function of $X,Y,Z$; in particular $\lvert W\rvert=XY-Z^{2}$. Let $\mathbb{E}_{\alpha,\beta}[f(X,Y,Z)]=\int f(x,y,z)\,p(w)\,dw$ denote the expectation with respect to $\mathbf{B}(\alpha,\beta;I_{2})$, where $f(\cdot,\cdot,\cdot)$ is an arbitrary function of three variables. We will compute $\mathbb{E}_{\alpha,\beta}[f(X,Y,Z)]$ when $f$ takes the monomial form $f(X,Y,Z)=X^{m}Y^{r}Z^{2t}$.

2 Marginal Distribution

In this section we compute $\mathbb{E}_{\alpha,\beta}[f(X,Y,Z)]$ for $f(X,Y,Z)=X^{m}$ and show that $X$ follows a one-dimensional Beta distribution. To accomplish this, we need the following lemma:

Lemma 1.

Let $A=XY-Z^{2}$ and $B=1-X-Y+A$; then we have

\[
\mathbb{E}_{\alpha,\beta}[A\,f(X,Y,Z)]=\frac{\alpha(\alpha-1/2)}{(\alpha+\beta)(\alpha+\beta-1/2)}\,\mathbb{E}_{\alpha+1,\beta}[f(X,Y,Z)]
\tag{3}
\]
\[
\mathbb{E}_{\alpha,\beta}[B\,f(X,Y,Z)]=\frac{\beta(\beta-1/2)}{(\alpha+\beta)(\alpha+\beta-1/2)}\,\mathbb{E}_{\alpha,\beta+1}[f(X,Y,Z)]
\tag{4}
\]
Proof.

For the multivariate Beta function we have $B_{p}(a,b)=\frac{\Gamma_{p}(a)\Gamma_{p}(b)}{\Gamma_{p}(a+b)}$, where $\Gamma_{p}$ is the multivariate Gamma function (see [3]). For $p=2$ we have $\Gamma_{2}(a)=\sqrt{\pi}\,\Gamma(a)\Gamma(a-1/2)$. Since multiplying the density of $\mathbf{B}(\alpha,\beta;I_{2})$ by $A=\lvert w\rvert$ yields $\frac{B_{2}(\alpha+1,\beta)}{B_{2}(\alpha,\beta)}$ times the density of $\mathbf{B}(\alpha+1,\beta;I_{2})$, we obtain

\begin{align*}
\frac{\mathbb{E}_{\alpha,\beta}[A\,f(X,Y,Z)]}{\mathbb{E}_{\alpha+1,\beta}[f(X,Y,Z)]}
&=\frac{B_{2}(\alpha+1,\beta)}{B_{2}(\alpha,\beta)}\\
&=\frac{\Gamma_{2}(\alpha+1)}{\Gamma_{2}(\alpha)}\,\frac{\Gamma_{2}(\alpha+\beta)}{\Gamma_{2}(\alpha+\beta+1)}\\
&=\frac{\Gamma(\alpha+1)}{\Gamma(\alpha)}\,\frac{\Gamma(\alpha+1/2)}{\Gamma(\alpha-1/2)}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha+\beta+1)}\,\frac{\Gamma(\alpha+\beta-1/2)}{\Gamma(\alpha+\beta+1/2)}\\
&=\frac{\alpha(\alpha-1/2)}{(\alpha+\beta)(\alpha+\beta-1/2)}
\end{align*}

Thus Equation (3) is proved and Equation (4) follows similarly. ∎
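As a sanity check, Equation (3) can be verified by simulation. The sketch below is a minimal illustration, not part of the paper; it reuses the hypothetical `sample_beta2` helper from the sketch in Section 1 and takes $f(X,Y,Z)=X$ with the arbitrary test values $\alpha=2$, $\beta=3$.

```python
# Numerical check of Equation (3) with f(X, Y, Z) = X (illustrative only).
import numpy as np

# sample_beta2 is the assumed helper from the sketch in Section 1
alpha, beta, N = 2.0, 3.0, 200_000

W = sample_beta2(alpha, beta, N)                     # draws from B(alpha, beta; I_2)
W_shift = sample_beta2(alpha + 1, beta, N, seed=1)   # draws from B(alpha+1, beta; I_2)

X, Y, Z = W[:, 0, 0], W[:, 1, 1], W[:, 0, 1]
A = X * Y - Z**2                                     # A = |W|

lhs = np.mean(A * X)                                 # E_{alpha,beta}[A X]
ratio = alpha * (alpha - 0.5) / ((alpha + beta) * (alpha + beta - 0.5))
rhs = ratio * np.mean(W_shift[:, 0, 0])              # ratio * E_{alpha+1,beta}[X]
print(lhs, rhs)  # the two values should agree up to Monte Carlo error
```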

Using the above Lemma, we give the main conclusion of this section:

Theorem 1.

$\mathbb{E}_{\alpha,\beta}[X^{m}]=\prod_{i=0}^{m-1}\frac{\alpha+i}{\alpha+\beta+i}$, and $X$ follows the Beta distribution $\mathrm{Beta}(\alpha,\beta)$.

Proof.

Since the positions of $X$ and $Y$ are symmetric, $\mathbb{E}_{\alpha,\beta}[X]=\mathbb{E}_{\alpha,\beta}[Y]$. Taking the expectation with respect to $\mathbf{B}(\alpha,\beta;I_{2})$ on both sides of $B=1-X-Y+A$ and using Lemma 1, we have

\[
\frac{\beta(\beta-1/2)}{(\alpha+\beta)(\alpha+\beta-1/2)}=1-2\,\mathbb{E}_{\alpha,\beta}[X]+\frac{\alpha(\alpha-1/2)}{(\alpha+\beta)(\alpha+\beta-1/2)}
\]

Solving the above equation, we get $\mathbb{E}_{\alpha,\beta}[X]=\frac{\alpha}{\alpha+\beta}$. Using Equation (3) recursively with $f(X,Y,Z)=X$, we obtain $\mathbb{E}_{\alpha,\beta}[X^{m}]=\prod_{i=0}^{m-1}\frac{\alpha+i}{\alpha+\beta+i}$. This expression coincides with the moment sequence of the Beta distribution on the bounded interval $[0,1]$; since a distribution supported on a bounded interval is determined by its moments, we conclude that $X$ indeed follows the Beta distribution $\mathrm{Beta}(\alpha,\beta)$. ∎
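The moment formula and the Beta marginal can likewise be checked numerically. The sketch below is illustrative only; it reuses the hypothetical `sample_beta2` helper from Section 1 and compares the empirical moments of $X$ with the product formula and with the moments of $\mathrm{Beta}(\alpha,\beta)$ computed by SciPy.

```python
# Numerical check of Theorem 1 (illustrative only; sample_beta2 from Section 1).
import numpy as np
from scipy.stats import beta as beta_dist

a, b, N = 2.0, 3.0, 200_000
X = sample_beta2(a, b, N)[:, 0, 0]

for m in range(1, 5):
    empirical = np.mean(X**m)
    formula = np.prod([(a + i) / (a + b + i) for i in range(m)])
    print(m, empirical, formula, beta_dist.moment(m, a, b))  # all three should agree
```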

3 Mixed Moments

In this section, we further compute $\mathbb{E}_{\alpha,\beta}[X^{m}Y^{r}Z^{t}]$. Since the density (1) is invariant under $Z\mapsto-Z$, we have $\mathbb{E}_{\alpha,\beta}[X^{m}Y^{r}Z^{2t+1}]=0$, so we only need to consider even powers of $Z$. First we consider the case $r=0$:

Theorem 2.
\[
\mathbb{E}_{\alpha,\beta}[X^{m}Z^{2t}]=\frac{(2t-1)!!}{2^{t}}\,\prod_{i=0}^{t-1}\frac{\beta+i}{\alpha+\beta+i-1/2}\;\frac{\prod_{i=0}^{t+m-1}(\alpha+i)}{\prod_{i=0}^{2t+m-1}(\alpha+\beta+i)}
\tag{5}
\]
Proof.

We prove Equation (5) by induction on $t$. First, Equation (5) holds for $t=0$ by Theorem 1. Let $A,B$ be as in Lemma 1. Suppose Equation (5) holds for $\mathbb{E}[Z^{2t-2}X^{m}]$ (for all admissible $\alpha,\beta$ and all $m$). Using $Z^{2}=XY-A=X(1-X+A-B)-A$, we have

\begin{align*}
\mathbb{E}_{\alpha,\beta}[Z^{2t}X^{m}]
&=\mathbb{E}_{\alpha,\beta}[Z^{2t-2}X^{m}(X-X^{2}+AX-BX-A)]\\
&=\mathbb{E}_{\alpha,\beta}[Z^{2t-2}(X^{m+1}-X^{m+2})]+\mathbb{E}_{\alpha,\beta}[AZ^{2t-2}(X^{m+1}-X^{m})]-\mathbb{E}_{\alpha,\beta}[BZ^{2t-2}X^{m+1}]\\
&=\left(1-\frac{\alpha+t+m}{\alpha+\beta+2t+m-1}\right)\mathbb{E}_{\alpha,\beta}[Z^{2t-2}X^{m+1}]\\
&\quad+\frac{\alpha(\alpha-1/2)}{(\alpha+\beta)(\alpha+\beta-1/2)}\,\mathbb{E}_{\alpha+1,\beta}[Z^{2t-2}(X^{m+1}-X^{m})]\\
&\quad-\frac{\beta(\beta-1/2)}{(\alpha+\beta)(\alpha+\beta-1/2)}\,\mathbb{E}_{\alpha,\beta+1}[Z^{2t-2}X^{m+1}]\\
&=\frac{\beta+t-1}{\alpha+\beta+2t+m-1}\,\mathbb{E}_{\alpha,\beta}[Z^{2t-2}X^{m+1}]\\
&\quad+\frac{(\alpha+t+m)(\alpha-1/2)}{(\alpha+\beta+2t+m-1)(\alpha+\beta+t-3/2)}\left(1-\frac{\alpha+\beta+2(t-1)+m+1}{\alpha+t+m}\right)\mathbb{E}_{\alpha,\beta}[Z^{2t-2}X^{m+1}]\\
&\quad-\frac{(\beta+t-1)(\beta-1/2)}{(\alpha+\beta+2t+m-1)(\alpha+\beta+t-3/2)}\,\mathbb{E}_{\alpha,\beta}[Z^{2t-2}X^{m+1}]\\
&=\frac{(t-1/2)(\beta+t-1)}{(\alpha+\beta+t-3/2)(\alpha+\beta+2t+m-1)}\,\mathbb{E}_{\alpha,\beta}[Z^{2t-2}X^{m+1}]
\end{align*}

Applying Equation (5) at $t-1$ (with exponent $m+1$) to the right-hand side yields the claimed expression for $t$. ∎
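For instance, taking $m=0$, $t=1$ in Equation (5) gives $\mathbb{E}_{\alpha,\beta}[Z^{2}]=\frac{\alpha\beta}{2(\alpha+\beta-1/2)(\alpha+\beta)(\alpha+\beta+1)}$. The sketch below is a numerical illustration of Equation (5), again reusing the hypothetical `sample_beta2` helper from Section 1 with arbitrary test parameters.

```python
# Numerical check of Equation (5) (illustrative only; sample_beta2 from Section 1).
import numpy as np


def moment_xz(alpha, beta, m, t):
    """Right-hand side of Equation (5), i.e. E[X^m Z^{2t}] under B(alpha, beta; I_2)."""
    val = float(np.prod(np.arange(1, 2 * t, 2)))   # (2t - 1)!!
    val /= 2.0**t
    for i in range(t):
        val *= (beta + i) / (alpha + beta + i - 0.5)
    for i in range(t + m):
        val *= alpha + i
    for i in range(2 * t + m):
        val /= alpha + beta + i
    return val


a, b, N = 2.0, 3.0, 500_000
W = sample_beta2(a, b, N)
X, Z = W[:, 0, 0], W[:, 0, 1]
for m, t in [(0, 1), (1, 1), (2, 2)]:
    print((m, t), np.mean(X**m * Z**(2 * t)), moment_xz(a, b, m, t))
```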

From Theorem 2, we can obtain the general formula for the mixed moments when $m\geq r$:

Corollary 1.
\begin{align}
\mathbb{E}_{\alpha,\beta}[X^{m}Y^{r}Z^{2t}]
&=\frac{(2t-1)!!}{2^{t}}\,\frac{\prod_{j=0}^{t-1}(\beta+j)}{\prod_{j=0}^{t+r-1}(\alpha+\beta-1/2+j)}\,\frac{\prod_{j=0}^{t+m-1}(\alpha+j)}{\prod_{j=0}^{2t+m-1}(\alpha+\beta+j)}\notag\\
&\quad\cdot\sum_{i=0}^{r}\frac{1}{2^{i}}\binom{r}{i}\prod_{j=1}^{i}(2t-1+2j)\,\frac{\prod_{j=0}^{r-i-1}(\alpha-1/2+j)\prod_{j=0}^{i-1}(\beta+t+j)}{\prod_{j=0}^{i-1}(\alpha+\beta+2t+m+j)}
\tag{6}
\end{align}

Since $\mathbb{E}_{\alpha,\beta}[X^{m}Y^{r}Z^{2t}]=\mathbb{E}_{\alpha,\beta}[X^{r}Y^{m}Z^{2t}]$, when $m<r$ we can exchange $m$ with $r$ and then apply Corollary 1.

Proof.

By the definition of $A$ in Lemma 1, $XY=A+Z^{2}$, so $X^{m}Y^{r}Z^{2t}=X^{m-r}(A+Z^{2})^{r}Z^{2t}$. Using the binomial theorem we have $\mathbb{E}_{\alpha,\beta}[X^{m}Y^{r}Z^{2t}]=\sum_{i=0}^{r}\binom{r}{i}\mathbb{E}_{\alpha,\beta}[A^{r-i}X^{m-r}Z^{2(t+i)}]$. Then using Equation (3) recursively we have

\[
\mathbb{E}_{\alpha,\beta}[A^{r-i}X^{m-r}Z^{2(t+i)}]=\prod_{j=0}^{r-i-1}\frac{(\alpha+j)(\alpha+j-1/2)}{(\alpha+\beta+j)(\alpha+\beta+j-1/2)}\;\mathbb{E}_{\alpha+r-i,\beta}[X^{m-r}Z^{2(t+i)}]
\]

Using Theorem 2, we finally obtain the expression in Equation (6). ∎
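Equation (6) can be checked in the same way. The sketch below is illustrative only: it implements the right-hand side of Equation (6) directly and compares it with Monte Carlo estimates, again reusing the hypothetical `sample_beta2` helper from Section 1.

```python
# Numerical check of Equation (6) for m >= r (illustrative only; sample_beta2 from Section 1).
import numpy as np
from math import comb


def mixed_moment(alpha, beta, m, r, t):
    """Right-hand side of Equation (6), i.e. E[X^m Y^r Z^{2t}] for m >= r."""
    val = float(np.prod(np.arange(1, 2 * t, 2))) / 2.0**t   # (2t-1)!! / 2^t
    val *= np.prod([beta + j for j in range(t)])
    val /= np.prod([alpha + beta - 0.5 + j for j in range(t + r)])
    val *= np.prod([alpha + j for j in range(t + m)])
    val /= np.prod([alpha + beta + j for j in range(2 * t + m)])
    total = 0.0
    for i in range(r + 1):
        term = comb(r, i) / 2.0**i
        term *= np.prod([2 * t - 1 + 2 * j for j in range(1, i + 1)])
        term *= np.prod([alpha - 0.5 + j for j in range(r - i)])
        term *= np.prod([beta + t + j for j in range(i)])
        term /= np.prod([alpha + beta + 2 * t + m + j for j in range(i)])
        total += term
    return val * total


a, b, N = 2.0, 3.0, 500_000
W = sample_beta2(a, b, N)
X, Y, Z = W[:, 0, 0], W[:, 1, 1], W[:, 0, 1]
for m, r, t in [(1, 1, 0), (2, 1, 1), (3, 2, 1)]:
    print((m, r, t), np.mean(X**m * Y**r * Z**(2 * t)), mixed_moment(a, b, m, r, t))
```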

4 Case Study

In this section, we give a natural example that illustrates how our result can be used. Consider the random matrix $S=QQ^{T}$, where $Q$ is a uniformly distributed $n\times k$ matrix with orthonormal columns. We are interested in how $\mathbb{E}[S_{11}^{m}S_{12}^{2t}]$ behaves as $k\to\infty$ while the ratio $r=\frac{k}{n}$ is kept fixed (so that $n\to\infty$ as well).

From Proposition 7.2 of [2], $\begin{pmatrix}S_{11}&S_{12}\\ S_{21}&S_{22}\end{pmatrix}$ is exactly a $2\times 2$ matrix Beta random matrix with distribution $\mathbf{B}(\frac{k}{2},\frac{n-k}{2};I_{2})$. Using Theorem 2, we get $\mathbb{E}[S_{11}^{m}S_{12}^{2t}]\sim(2t-1)!!\,\frac{r^{t+m}(1-r)^{t}}{n^{t}}=O(n^{-t})$. That is, $\mathbb{E}[S_{11}^{m}S_{12}^{2t}]$ decays at the rate $n^{-t}$.
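The scaling can be illustrated numerically. The following sketch (illustrative only; the helper name is ours) draws $Q$ as the orthonormal factor of an $n\times k$ Gaussian matrix, which yields the same $S=QQ^{T}$ (column sign choices in the QR factorization do not affect $QQ^{T}$), and tracks $n^{t}\,\mathbb{E}[S_{11}^{m}S_{12}^{2t}]$ as $n$ grows with $r$ fixed; this product should approach the constant $(2t-1)!!\,r^{t+m}(1-r)^{t}$ stated above.

```python
# Numerical illustration of the n^{-t} scaling (illustrative only).
import numpy as np


def sample_s11_s12(n, k, n_samples, seed=0):
    """Draw S11, S12 for S = Q Q^T, Q the orthonormal factor of an n x k Gaussian matrix."""
    rng = np.random.default_rng(seed)
    s11 = np.empty(n_samples)
    s12 = np.empty(n_samples)
    for i in range(n_samples):
        G = rng.standard_normal((n, k))
        Q, _ = np.linalg.qr(G)      # n x k with orthonormal columns
        s11[i] = Q[0] @ Q[0]        # S_{11} = first row dotted with itself
        s12[i] = Q[0] @ Q[1]        # S_{12} = first row dotted with second row
    return s11, s12


m, t, r = 1, 1, 0.5
limit = float(np.prod(np.arange(1, 2 * t, 2))) * r**(t + m) * (1 - r)**t
for n in [20, 40, 80, 160]:
    k = int(r * n)
    s11, s12 = sample_s11_s12(n, k, 20_000)
    est = np.mean(s11**m * s12**(2 * t))
    print(n, est, n**t * est, "limit:", limit)  # n^t * est should approach the limit
```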

5 Conclusion

We have derived analytical formulas for the moments of the multivariate Beta distribution of a $2\times 2$ matrix. This result is helpful for analyzing other statistical properties of the multivariate Beta distribution.

References

  • [1] A. P. Dawid. Some matrix-variate distribution theory: Notational considerations and a Bayesian application. Biometrika, 68(1):265–274, 1981.
  • [2] Morris L. Eaton. Chapter 7: Random orthogonal matrices, volume 1 of Regional Conference Series in Probability and Statistics, pages 100–107. Institute of Mathematical Statistics and American Statistical Association, Hayward, CA and Alexandria, VA, 1989.
  • [3] A. E. Ingham. An integral which occurs in statistics. Mathematical Proceedings of the Cambridge Philosophical Society, 29(2):271–276, 1933.
  • [4] Yoshihiko Konno. Exact moments of the multivariate F and Beta distributions. Journal of the Japan Statistical Society, 18(2):123–130, 1988.
  • [5] Carl Ludwig Siegel. Über die analytische Theorie der quadratischen Formen. Annals of Mathematics, 36(3):527–606, 1935.