
A new method for estimating the tail index using a truncated sample sequence

Fuquan Tang       Dong Han
Department of Statistics, School of Mathematical Sciences,
Shanghai Jiao Tong University, Shanghai, 200240, China
ABSTRACT

This article proposes a new truncated estimation method for the tail index $\alpha$ of extremely heavy-tailed distributions with infinite mean or variance. We not only present two truncated estimators, $\hat{\alpha}$ for estimating $\alpha$ ($0<\alpha\leq 1$) and $\hat{\alpha}^{\prime}$ for estimating $\alpha$ ($1<\alpha\leq 2$), but also prove their asymptotic statistical properties. Numerical simulations comparing them with six known estimators in terms of estimation error, Type I Error, and power show that the performance of the two new truncated estimators is quite good on the whole.

Supported by the National Natural Science Foundation of China (11531001).
Corresponding author, E-mail: [email protected]

KEYWORDS: Heavy-tailed distributions, tail index, truncated sample mean, simulation.

1. Introduction

Heavy-tailed phenomena are widespread in many aspects of our lives and arise in a variety of disciplines such as physics, meteorology, computer science, biology, and finance. Probabilistic and statistical methods and theories for heavy-tailed phenomena have been used to study the magnitude of earthquakes, the diameter of craters on the surface of the moon, the size of interplanetary fragments, the frequency of words in human languages, and so on [References, References, References, References].

Geography and hydrology are important settings for the study and application of heavy-tailed distributions. In 1998, Anderson and Meerschaert [References] discussed heavy-tailed time series models and provided a periodic ARMA model for the Salt River. In 2022, Merz et al. [References] provided a detailed and coherent review on understanding heavy tails of flood peak distributions and proposed nine hypotheses on the mechanisms generating heavy-tailed phenomena in flood systems. In financial markets, Mandelbrot [References] presented seminal research on cotton prices using heavy-tailed distribution theory. In 2013, Ibragimov et al. [References] found that emerging exchange markets are more pronouncedly heavy-tailed and showed that the heavy-tailed properties did not change noticeably during the financial and economic crisis period.

There is a large literature proposing numerous ideas and methods for estimating the tail index $\alpha$ of heavy-tailed distributions. The size of $\alpha$ measures the heaviness of the tail: the smaller $\alpha$ is, the higher the probability of an extreme heavy-tailed event. Since Hill put forward the famous Hill estimator in 1975 [References], researchers have provided many estimation methods for $\alpha$, such as the DPR estimator [References, References], the QQ estimator [References], the Moment estimator [References], the $L^{p}$ quantile estimator [References], estimators of the extreme value index in a censorship framework [References, References, References, References], the t-Hill estimator [References, References], the IPO estimator [References], and so on. More than 100 tail index estimators have been reviewed in two survey papers [References, References].

It can be seen that nearly all of these estimators are based on the order statistics of the observed samples. Moreover, estimators based on order statistics have three unsatisfactory characteristics: (1) their calculation is relatively complex, since the order statistics are not easy to compute for large sample sizes; (2) the mathematical meaning of the estimators for the tail index is not obvious; (3) there is no explicit expression for their rate of strong consistency convergence.

In order to make up for the shortcomings of existing estimation methods, we propose a new truncated estimation method to estimate the tail index $\alpha$ ($0<\alpha\leq 2$) of heavy-tailed distributions with infinite mean or variance. The two proposed estimators, $\hat{\alpha}$ for $0<\alpha\leq 1$ and $\hat{\alpha}^{\prime}$ for $1<\alpha\leq 2$, are based on the truncated sample mean and the truncated sample second moment, respectively; they are not only relatively easy to compute, but their strong consistency convergence rates and asymptotic normality can also be established.

In Section 2, we present the two truncated estimators $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$ and obtain their asymptotic statistical properties. Section 3 compares the two truncated estimators with six known estimators in terms of estimation error, Type I Error, and power by numerical simulations. Section 4 provides concluding remarks. The proofs of the three theorems are given in the Appendix.

2. Two truncated estimators

Since a random variable $X$ can be written as the difference of its positive and negative parts, $X=X^{+}-X^{-}$, we consider only nonnegative random variables in this paper. Let $X_{k}$, $k\geq 1$, be independent and identically distributed (i.i.d.) random variables with the extremely heavy-tailed distribution function $F(x)=1-1/x^{\alpha}$ for $x\geq 1$, where the tail index $\alpha\in(0,\,2]$ is unknown. When $\alpha\in(0,\,1]$ the mean is infinite, and when $\alpha\in(1,\,2]$ the mean is finite but the variance is infinite.
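For concreteness, samples from $F(x)=1-1/x^{\alpha}$, $x\geq 1$, can be drawn by inverse-transform sampling, since $F^{-1}(u)=(1-u)^{-1/\alpha}$. A minimal Python sketch (our illustration; the name `sample_pareto` is not from the paper):

```python
import numpy as np

def sample_pareto(alpha: float, n: int, rng: np.random.Generator) -> np.ndarray:
    """Draw n i.i.d. samples from F(x) = 1 - 1/x**alpha, x >= 1,
    via inverse-transform sampling: F^{-1}(u) = (1 - u)**(-1/alpha)."""
    u = rng.uniform(size=n)
    return (1.0 - u) ** (-1.0 / alpha)

rng = np.random.default_rng(0)
x = sample_pareto(0.5, 10000, rng)  # alpha = 0.5: infinite mean
```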

In this section, we present the two truncated estimators $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$ for estimating $\alpha$ ($0<\alpha\leq 1$) and $\alpha$ ($1<\alpha\leq 2$), respectively, and prove their asymptotic statistical properties.

To this end, let $\{b_{n}\}$ be a positive truncation sequence satisfying $b_{n}\nearrow\infty$ as $n\to\infty$. Define the truncated random variable $X_{k}(b_{n}):=X_{k}I(X_{k}\leq b_{n})$, where $I(\cdot)$ is the indicator function. The truncated mean $\mu_{n}$, the truncated sample mean $\hat{\mu}_{n}$, the truncated second moment $\nu^{2}_{n}$, and the truncated sample second moment $\hat{\nu}^{2}_{n}$ are given by

$$\mu_{n}:=\mathrm{E}(X_{k}(b_{n}))=\frac{\alpha}{1-\alpha}\Big(b_{n}^{1-\alpha}-1\Big),\qquad\hat{\mu}_{n}:=n^{-1}\sum_{k=1}^{n}X_{k}(b_{n})\tag{1}$$

for $0<\alpha<1$ and

$$\nu^{2}_{n}:=\mathrm{E}(X^{2}_{k}(b_{n}))=\frac{\alpha}{2-\alpha}\Big(b_{n}^{2-\alpha}-1\Big),\qquad\hat{\nu}^{2}_{n}:=n^{-1}\sum_{k=1}^{n}X^{2}_{k}(b_{n})\tag{2}$$

for $1<\alpha<2$. It follows from equations (1) and (2) that

$$\alpha=1-\frac{\ln\big[\frac{1-\alpha}{\alpha}\mu_{n}+1\big]}{\ln b_{n}}\tag{3}$$

for $0<\alpha<1$ and

$$\alpha=2-\frac{\ln\big[\frac{2-\alpha}{\alpha}\nu^{2}_{n}+1\big]}{\ln b_{n}}\tag{4}$$

for $1<\alpha<2$.
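The identities (1) and (2) are easy to check by simulation. A small sketch (our own illustration, under the same Pareto model; the two printed values should agree closely):

```python
import numpy as np

alpha, n = 0.5, 10**6
b_n = float(n) ** 0.5            # one admissible truncation level
rng = np.random.default_rng(1)
x = (1.0 - rng.uniform(size=n)) ** (-1.0 / alpha)  # F(x) = 1 - 1/x**alpha

mu_hat = np.mean(np.where(x <= b_n, x, 0.0))                  # truncated sample mean
mu_n = alpha / (1.0 - alpha) * (b_n ** (1.0 - alpha) - 1.0)   # equation (1)
print(mu_hat, mu_n)              # both approximate the truncated mean
```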

Hence, we can define two truncated estimators $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$ as the solutions of the equations $x=G_{1}(x)$ for $0<x<1$ and $y=G_{2}(y)$ for $1<y<2$, respectively, obtained by replacing $\alpha$, $\mu_{n}$, and $\nu^{2}_{n}$ in equations (3) and (4) with $\hat{\alpha}$, $\hat{\mu}_{n}$, $\hat{\alpha}^{\prime}$, and $\hat{\nu}^{2}_{n}$, respectively; that is,

$$\hat{\alpha}=G_{1}(\hat{\alpha}):=1-\frac{\ln\big[\frac{1-\hat{\alpha}}{\hat{\alpha}}\hat{\mu}_{n}+1\big]}{\ln b_{n}}\tag{5}$$

for $0<\hat{\alpha}<1$ and

$$\hat{\alpha}^{\prime}=G_{2}(\hat{\alpha}^{\prime}):=2-\frac{\ln\big[\frac{2-\hat{\alpha}^{\prime}}{\hat{\alpha}^{\prime}}\hat{\nu}^{2}_{n}+1\big]}{\ln b_{n}}\tag{6}$$

for $1<\hat{\alpha}^{\prime}<2$.

Letting $\hat{\alpha}\nearrow 1$ in equation (5) and $\hat{\alpha}^{\prime}\nearrow 2$ in equation (6), it follows that $\hat{\alpha}\sim\hat{\mu}_{n}/\ln b_{n}$ and $\hat{\alpha}^{\prime}\sim\hat{\nu}^{2}_{n}/\ln b_{n}$. Hence, we can use the following two estimators to estimate $\alpha=1$ and $\alpha=2$, respectively:

$$\hat{\alpha}:=\frac{\hat{\mu}_{n}}{\ln b_{n}},\tag{7}$$
$$\hat{\alpha}^{\prime}:=\frac{\hat{\nu}^{2}_{n}}{\ln b_{n}}.\tag{8}$$

Since it is difficult to obtain analytic solutions $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$ of the two equations $\hat{\alpha}-G_{1}(\hat{\alpha})=0$ and $\hat{\alpha}^{\prime}-G_{2}(\hat{\alpha}^{\prime})=0$, we present two recursive estimators for $k\geq 1$:

$$\hat{\alpha}_{k}=G_{1}(\hat{\alpha}_{k-1}),\qquad\text{if }0<\hat{\alpha}_{0}<1\tag{9}$$

for $0<\alpha\leq 1$ and

$$\hat{\alpha}^{\prime}_{k}=G_{2}(\hat{\alpha}^{\prime}_{k-1}),\qquad\text{if }1<\hat{\alpha}^{\prime}_{0}<2\tag{10}$$

for $1<\alpha\leq 2$, where $\hat{\alpha}_{0}$ and $\hat{\alpha}^{\prime}_{0}$ are two constants.
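A minimal sketch of the recursions (9) and (10) in Python (our illustration; the stopping rule based on successive iterates is a practical stand-in for the iteration counts guaranteed by Theorem 1 below):

```python
import numpy as np

def G1(a, mu_hat, log_bn):
    """Fixed-point map of equation (5), for 0 < a < 1."""
    return 1.0 - np.log((1.0 - a) / a * mu_hat + 1.0) / log_bn

def G2(a, nu2_hat, log_bn):
    """Fixed-point map of equation (6), for 1 < a < 2."""
    return 2.0 - np.log((2.0 - a) / a * nu2_hat + 1.0) / log_bn

def truncated_estimator(x, b_n, a0, eps=1e-3, max_iter=200):
    """Iterate (9) if a0 is in (0,1), or (10) if a0 is in (1,2)."""
    log_bn = np.log(b_n)
    if a0 < 1.0:
        stat, G = np.mean(np.where(x <= b_n, x, 0.0)), G1       # mu_hat_n
    else:
        stat, G = np.mean(np.where(x <= b_n, x ** 2, 0.0)), G2  # nu2_hat_n
    a = a0
    for _ in range(max_iter):
        a_next = G(a, stat, log_bn)
        if abs(a_next - a) < eps:
            break
        a = a_next
    return a_next
```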

The following theorem shows that the two estimators $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$ can be approximated by the two sequences of estimators $\{\hat{\alpha}_{k}\}$ and $\{\hat{\alpha}^{\prime}_{k}\}$, respectively.

Theorem 1.

Let $0\leq\beta<1/2$ and let $b_{n}$ satisfy $b_{n}^{\alpha}\ln b_{n}\leq\alpha n^{1-2\beta}/\ln n$ for $0<\alpha<2$, $\alpha\neq 1$, and large $n$. Then both of the equations $x-G_{1}(x)=0$, $0<x<1$, and $y-G_{2}(y)=0$, $1<y<2$, have unique solutions $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$, respectively. If $0<\hat{\alpha}_{0}<\hat{\alpha}<1$ and $1<\hat{\alpha}_{0}^{\prime}<\hat{\alpha}^{\prime}<2$ (or $0<\hat{\alpha}<\hat{\alpha}_{0}<1$ and $1<\hat{\alpha}^{\prime}<\hat{\alpha}_{0}^{\prime}<2$), then $\hat{\alpha}_{k}\nearrow\hat{\alpha}$ and $\hat{\alpha}_{k}^{\prime}\nearrow\hat{\alpha}^{\prime}$ (or $\hat{\alpha}_{k}\searrow\hat{\alpha}$ and $\hat{\alpha}_{k}^{\prime}\searrow\hat{\alpha}^{\prime}$) and

$$\left|\hat{\alpha}-\hat{\alpha}_{k}\right|\leq\left(\frac{1}{\hat{\alpha}^{\star}(1-\hat{\alpha})\ln b_{n}}\right)^{k}\left|\hat{\alpha}-\hat{\alpha}_{0}\right|\tag{11}$$
$$\left|\hat{\alpha}^{\prime}-\hat{\alpha}_{k}^{\prime}\right|\leq\left(\frac{2}{\hat{\alpha}^{\star\prime}\left(2-\hat{\alpha}^{\prime}\right)\ln b_{n}}\right)^{k}\left|\hat{\alpha}^{\prime}-\hat{\alpha}_{0}^{\prime}\right|\tag{12}$$

where $\hat{\alpha}^{\star}=\min\{\hat{\alpha},\hat{\alpha}_{0}\}$ and $\hat{\alpha}^{\star\prime}=\min\{\hat{\alpha}^{\prime},\hat{\alpha}_{0}^{\prime}\}$.

Remark 1.

Take $n$ large enough that $A:=\hat{\alpha}^{\star}(1-\hat{\alpha})\ln b_{n}>1$ and $B:=\hat{\alpha}^{\star\prime}(2-\hat{\alpha}^{\prime})\ln b_{n}/2>1$. Note that $|\hat{\alpha}-\hat{\alpha}_{0}|\leq 1$ and $|\hat{\alpha}^{\prime}-\hat{\alpha}_{0}^{\prime}|\leq 1$. It follows from inequalities (11) and (12) that

$$\left|\hat{\alpha}-\hat{\alpha}_{k}\right|\leq e^{-k\ln A},\qquad\left|\hat{\alpha}^{\prime}-\hat{\alpha}_{k}^{\prime}\right|\leq e^{-k\ln B}.$$

The two inequalities above imply that $\{\hat{\alpha}_{k}\}$ and $\{\hat{\alpha}_{k}^{\prime}\}$ converge (almost everywhere) at least exponentially fast to $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$, respectively.

Remark 2.

If we do not know whether $\alpha$ lies in the interval $(0,1]$ or the interval $(1,2]$, we may choose the initial values $\hat{\alpha}_{0}$ and $\hat{\alpha}^{\prime}_{0}$ as follows: take $n_{0}$ samples (for example, $n_{0}=50$) and set $\hat{\alpha}_{0}=\hat{\mu}_{n_{0}}/\ln b_{n_{0}}$ if $\hat{\mu}_{n_{0}}/\ln b_{n_{0}}\leq 1$, and $\hat{\alpha}^{\prime}_{0}=1.5$ if $\hat{\mu}_{n_{0}}/\ln b_{n_{0}}>1$.
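A sketch of this initialization rule (our code; `b_fn` is a hypothetical helper mapping a sample size to the truncation level, e.g. `b_fn = lambda n: n**q`):

```python
import numpy as np

def initial_value(x, b_fn, n0=50):
    """Remark 2: compute the pilot ratio mu_hat_{n0} / ln(b_{n0}) from the
    first n0 samples; start in (0,1] if it is <= 1, otherwise start at 1.5."""
    pilot = x[:n0]
    b0 = b_fn(n0)
    ratio = np.mean(np.where(pilot <= b0, pilot, 0.0)) / np.log(b0)
    return ratio if ratio <= 1.0 else 1.5
```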

In order to obtain the asymptotic statistical properties of $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$, we first give the following theorem, which describes the asymptotic statistical properties of the truncated sample mean $\hat{\mu}_{n}$ and the truncated sample second moment $\hat{\nu}_{n}^{2}$.

Theorem 2.

Assume that the conditions of Theorem 1 hold. Then

$$\mathbf{P}\Big(\frac{|\hat{\mu}_{n}-\mu_{n}|}{\mu_{n}}\geq\frac{2}{n^{\beta}\sqrt{\ln b_{n}}}\Big)\leq\frac{2}{n^{2}},\qquad\frac{\sqrt{n}(\hat{\mu}_{n}-\mu_{n})}{b_{n}^{1-\alpha/2}}\Rightarrow N\Big(0,\,\frac{\alpha}{2-\alpha}\Big)\tag{13}$$

for $0<\alpha\leq 1$ and

$$\mathbf{P}\Big(\frac{|\hat{\nu}^{2}_{n}-\nu^{2}_{n}|}{\nu^{2}_{n}}\geq\frac{2}{n^{\beta}\sqrt{\ln b_{n}}}\Big)\leq\frac{2}{n^{2}},\qquad\frac{\sqrt{n}(\hat{\nu}^{2}_{n}-\nu^{2}_{n})}{b_{n}^{2-\alpha/2}}\Rightarrow N\Big(0,\,\frac{\alpha}{4-\alpha}\Big)\tag{14}$$

for $1<\alpha\leq 2$, where "$\Rightarrow$" denotes convergence in distribution and $N(\mu,\sigma^{2})$ is the normal distribution with mean $\mu$ and variance $\sigma^{2}$.

The following theorem gives the asymptotic statistical properties of the two truncated estimators $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$.

Theorem 3.

Assume that the conditions of Theorem 1 hold. Then

$$\mathbf{P}\Big(|\hat{\alpha}-\alpha|\geq\frac{2}{n^{\beta}\ln b_{n}\sqrt{\ln b_{n}}}\Big)\leq\frac{2}{n^{2}},\qquad\frac{\sqrt{n}(\hat{\alpha}-\alpha)\ln b_{n}}{b_{n}^{\alpha/2}}\Rightarrow N\Big(0,\,\frac{(1-\alpha)^{2}}{\alpha(2-\alpha)}\Big)\tag{15}$$

for $0<\alpha<1$ and

$$\mathbf{P}\Big(|\hat{\alpha}^{\prime}-\alpha|\geq\frac{2}{n^{\beta}\ln b_{n}\sqrt{\ln b_{n}}}\Big)\leq\frac{2}{n^{2}},\qquad\frac{\sqrt{n}(\hat{\alpha}^{\prime}-\alpha)\ln b_{n}}{b_{n}^{\alpha/2}}\Rightarrow N\Big(0,\,\frac{(2-\alpha)^{2}}{\alpha(4-\alpha)}\Big)\tag{16}$$

for $1<\alpha<2$. Moreover,

$$\mathbf{P}\Big(|\hat{\alpha}-1|\geq\frac{2\sqrt{2}}{n^{\beta}\ln b_{n}}\Big)\leq\frac{2}{n^{2}},\qquad\frac{\sqrt{n}(\hat{\alpha}-1)\ln b_{n}}{\sqrt{b_{n}}}\Rightarrow N(0,\,1)\tag{17}$$

for $\alpha=1$ and $b_{n}\leq n^{1-2\beta}/\ln n$, and

$$\mathbf{P}\Big(|\hat{\alpha}^{\prime}-2|\geq\frac{2\sqrt{2}}{n^{\beta}\ln b_{n}}\Big)\leq\frac{2}{n^{2}},\qquad\frac{\sqrt{n}(\hat{\alpha}^{\prime}-2)\ln b_{n}}{b_{n}}\Rightarrow N(0,\,1)\tag{18}$$

for $\alpha=2$ and $b^{2}_{n}\leq n^{1-\beta}/\ln n$.
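For instance, (15) implies that for large $n$ the estimator $\hat{\alpha}$ fluctuates around $\alpha$ with standard deviation roughly $\frac{|1-\alpha|}{\sqrt{\alpha(2-\alpha)}}\cdot\frac{b_{n}^{\alpha/2}}{\sqrt{n}\,\ln b_{n}}$. A small numeric sketch (ours) of this scale:

```python
import numpy as np

def se_alpha_hat(alpha, n, q):
    """Asymptotic standard deviation of alpha_hat implied by (15),
    for 0 < alpha < 1 and b_n = n**q."""
    b_n = float(n) ** q
    return (abs(1.0 - alpha) * b_n ** (alpha / 2.0)
            / (np.sqrt(alpha * (2.0 - alpha)) * np.sqrt(n) * np.log(b_n)))

print(se_alpha_hat(0.5, 10000, 1.7))  # rough fluctuation scale of alpha_hat
```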

3. Numerical Simulations

In this section, we compare our two estimators $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$ with five other estimators in terms of estimation error, Type I Error, and power: the Hill estimator [References], the QQ estimator [References], the Moment estimator [References], the t-Hill estimator [References, References], and the t-lgHill estimator [References]. Since the asymptotic distribution of the IPO estimator [References] is unknown, we only report the estimation error of the IPO estimator, in Section 3.1. Apart from the two truncated estimators, the other six estimators can be written as

$$\hat{\alpha}_{H}^{-1}=\frac{1}{m}\sum_{i=1}^{m}\log\left(\frac{X_{(n-i+1)}}{X_{(n-m)}}\right),$$
$$\hat{\alpha}_{Q}^{-1}=\frac{\sum_{j=1}^{m}\log((m+1)/j)\log X_{(n-j+1)}-m^{-1}\sum_{j=1}^{m}\log((m+1)/j)\sum_{j=1}^{m}\log X_{(n-j+1)}}{\sum_{j=1}^{m}\log^{2}((m+1)/j)-m^{-1}\left(\sum_{j=1}^{m}\log((m+1)/j)\right)^{2}},$$
$$\hat{\alpha}_{M}^{-1}=M_{m,n}^{(1)}+1-\frac{1}{2}\left(1-\frac{\big(M_{m,n}^{(1)}\big)^{2}}{M_{m,n}^{(2)}}\right)^{-1},$$
$$\hat{\alpha}_{tH}^{-1}=\left(\frac{1}{m}\sum_{i=1}^{m}\frac{X_{(m+1,n)}}{X_{(i,n)}}\right)^{-1}-1,$$
$$\hat{\alpha}_{tlH}^{-1}=\frac{M_{n}^{(2)}-\big(M_{n}^{(1)}\big)^{2}}{M_{n}^{(1)}}$$

and

$$\hat{\alpha}_{IPO}^{-1}=-\frac{\log\left\{F_{n}^{\leftarrow}\left(\frac{3}{4}\right)+3\left[F_{n}^{\leftarrow}\left(\frac{3}{4}\right)-F_{n}^{\leftarrow}\left(\frac{1}{4}\right)\right]\right\}}{\log\hat{p}_{R}(0.25,n)},$$

where $X_{(1)}\leq X_{(2)}\leq\cdots\leq X_{(n)}$ are the order statistics of $X_{1},X_{2},\ldots,X_{n}$, and $m=m(n)\in\{1,2,\ldots,n-1\}$ with $m(n)/n\rightarrow 0$ as $n\rightarrow\infty$. $M_{m,n}^{(l)}$ is defined as

$$M_{m,n}^{(l)}=\frac{1}{m}\sum_{i=1}^{m}\left(\log\frac{X_{(n-i)}}{X_{(n-m)}}\right)^{l},\qquad l=1,2,$$

and the detailed definitions of $F_{n}^{\leftarrow}(p)$ and $\hat{p}_{R}(p,n)$ are given in [References].
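As one concrete instance, the Hill estimator above can be sketched in a few lines of Python (our code; the other order-statistics-based estimators follow the same pattern):

```python
import numpy as np

def hill_estimator(x, m):
    """Hill estimator: 1/alpha_H = (1/m) * sum_{i=1}^m
    log(X_{(n-i+1)} / X_{(n-m)}), built from the m largest order statistics."""
    xs = np.sort(x)
    n = len(xs)
    inv_alpha = np.mean(np.log(xs[n - m:] / xs[n - m - 1]))
    return 1.0 / inv_alpha
```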

3.1. The estimation error

In the following simulations, let $n=10000$ be the number of samples in each trial and let the truncation sequence be $b_{n}=n^{q}$ in the truncated estimator $\hat{\alpha}$ or $\hat{\alpha}^{\prime}$, where the exponent $q$ satisfies $0<\alpha q<1$. To control the search accuracy of the two truncated estimators, we take $k$ to be the number of iterations (see (11) and (12) in Theorem 1) such that $|\hat{\alpha}-\hat{\alpha}_{k}|\leq\epsilon=0.001$ or $|\hat{\alpha}^{\prime}-\hat{\alpha}_{k}^{\prime}|\leq\epsilon=0.001$. Remark 2 provides a method for determining the initial values $\hat{\alpha}_{0}$ and $\hat{\alpha}^{\prime}_{0}$.
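The estimation-error experiment for the truncated estimator can be sketched as follows (our code, for one row of Table 1 below; `G1` is the fixed-point map of equation (5), and the error metric is the average absolute error over the repetitions):

```python
import numpy as np

def G1(a, mu_hat, log_bn):
    return 1.0 - np.log((1.0 - a) / a * mu_hat + 1.0) / log_bn

n, alpha, q, N = 10000, 0.70, 1.30, 1000   # the alpha = 0.70 row of Table 1
rng = np.random.default_rng(2)
errors = []
for _ in range(N):
    x = (1.0 - rng.uniform(size=n)) ** (-1.0 / alpha)
    b_n = float(n) ** q
    mu_hat = np.mean(np.where(x <= b_n, x, 0.0))
    a = 0.5                                  # initial value alpha_hat_0
    for _ in range(100):                     # recursion (9), accuracy ~ 0.001
        a_new = G1(a, mu_hat, np.log(b_n))
        if abs(a_new - a) < 1e-3:
            break
        a = a_new
    errors.append(abs(a_new - alpha))
print(np.mean(errors))                       # average estimation error
```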

We first consider $\alpha\in(0,1]$ and take the initial value $\hat{\alpha}_{0}=0.5$. Let $m=m(n)=\sqrt{n}=100$ for the four estimators $\hat{\alpha}_{H}$, $\hat{\alpha}_{Q}$, $\hat{\alpha}_{M}$, and $\hat{\alpha}_{tH}$, and set $m=m(n)=0.5n=5000$ for $\hat{\alpha}_{tlH}$. We set $p=0.25$ for $\hat{\alpha}_{IPO}$.

Table 1 and Figure 1 below report the numerical simulation results for the seven estimators. All the numerical simulation results in this section were obtained using $N=10^{3}$ repetitions.

Table 1: The estimation error for different $\alpha\in(0,1]$ (columns $\alpha$, q, k are the simulation parameters).
$\alpha$   q   k   $\hat{\alpha}$   $\hat{\alpha}_{H}$   $\hat{\alpha}_{Q}$   $\hat{\alpha}_{M}$   $\hat{\alpha}_{tH}$   $\hat{\alpha}_{tlH}$   $\hat{\alpha}_{IPO}$
0.10 2.00 11 0.101 0.101 0.097 0.101 0.091 0.100 0.157
0.20 2.00 6 0.200 0.201 0.194 0.202 0.190 0.201 0.216
0.30 2.00 5 0.301 0.299 0.289 0.303 0.287 0.304 0.304
0.40 1.80 4 0.402 0.404 0.386 0.408 0.391 0.410 0.403
0.50 1.70 4 0.502 0.496 0.481 0.510 0.488 0.517 0.503
0.60 1.50 5 0.603 0.595 0.583 0.620 0.586 0.624 0.606
0.70 1.30 7 0.704 0.705 0.677 0.725 0.686 0.727 0.710
0.80 1.20 9 0.803 0.793 0.771 0.831 0.784 0.827 0.816
0.90 1.10 12 0.904 0.895 0.868 0.943 0.884 0.923 0.925
AE 0.002 0.004 0.017 0.016 0.013 0.015 0.016

The AE in the last row of Table 1 denotes the average estimation error; the smaller the AE, the better the estimator. We define $\mathbf{AE}=\sum|\hat{\alpha}_{Z}-\alpha|/9$, where $\hat{\alpha}_{Z}$ denotes one of the estimators $\hat{\alpha}$, $\hat{\alpha}_{H}$, $\hat{\alpha}_{Q}$, $\hat{\alpha}_{M}$, $\hat{\alpha}_{tH}$, $\hat{\alpha}_{tlH}$, and $\hat{\alpha}_{IPO}$, and the sum runs over $\alpha=0.10,0.20,0.30,0.40,0.50,0.60,0.70,0.80,0.90$.

Figure 1: The estimation error for different $\alpha\in(0,1)$.

It can be seen from Table 1 and Figure 1 that the estimation errors $|\hat{\alpha}-\alpha|$ of $\hat{\alpha}$ are smaller than those of the other six estimators for $\alpha=0.20,0.30,0.40,0.50,0.60,0.70,0.80,0.90$. Only for $\alpha=0.10$ is the estimation error $|\hat{\alpha}-0.10|=0.001$ of $\hat{\alpha}$ larger than that of $\hat{\alpha}_{tlH}$, since $|\hat{\alpha}_{tlH}-0.10|=0$. When $\alpha=0.10$ or $\alpha=0.20$, the estimation error of $\hat{\alpha}_{IPO}$ is larger than those of the other six estimators. Clearly, the average estimation error AE (0.002) of $\hat{\alpha}$ is the smallest among the seven estimators. That is, the estimator $\hat{\alpha}$ has the best performance in estimating $\alpha\in(0,1]$ among the seven estimators.

Next, we consider $\alpha\in(1,2]$ and take the initial value $\hat{\alpha}^{\prime}_{0}=1.5$. As for $\alpha\in(0,1]$, let $n=10000$ be the number of samples and let the truncation sequence be $b_{n}=n^{q}$ in the truncated estimator $\hat{\alpha}^{\prime}$, where the exponent $q$ satisfies $0<\alpha q<1$.

Table 2: The estimation error for different $\alpha\in(1,2)$ (columns $\alpha$, q, k are the simulation parameters).
$\alpha$   q   k   $\hat{\alpha}^{\prime}$   $\hat{\alpha}_{H}$   $\hat{\alpha}_{Q}$   $\hat{\alpha}_{M}$   $\hat{\alpha}_{tH}$   $\hat{\alpha}_{tlH}$   $\hat{\alpha}_{IPO}$
1.10 0.80 3 1.108 1.089 1.055 1.116 1.075 1.098 0.613
1.20 0.70 3 1.212 1.187 1.156 1.283 1.169 1.181 0.686
1.30 0.65 5 1.317 1.274 1.245 1.399 1.254 1.257 0.766
1.40 0.63 8 1.425 1.365 1.334 1.513 1.344 1.327 0.851
1.50 0.61 7 1.537 1.448 1.413 1.611 1.426 1.392 0.946
1.60 0.60 9 1.652 1.543 1.508 1.736 1.521 1.454 1.049
1.70 0.60 15 1.765 1.620 1.580 1.846 1.597 1.513 1.164
1.80 0.58 19 1.890 1.714 1.692 1.985 1.683 1.566 1.292
1.90 0.45 24 1.995 1.790 1.764 2.166 1.759 1.618 1.437
AE 0.095 0.110 0.136 0.116 0.141 0.282 0.463
Figure 2: The estimation error for different $\alpha\in(1,2)$.

Similarly, from Table 2 and Figure 2 we can see that the estimation errors $|\hat{\alpha}^{\prime}-\alpha|$ of $\hat{\alpha}^{\prime}$ are smaller than those of the other six estimators for $\alpha=1.20,1.30,1.40,1.50,1.60,1.70,1.90$. Only for $\alpha=1.10$ and $\alpha=1.80$ are the estimation errors $|\hat{\alpha}^{\prime}-1.10|=0.008$ and $|\hat{\alpha}^{\prime}-1.80|=0.090$ of $\hat{\alpha}^{\prime}$ larger than those of $\hat{\alpha}_{tlH}$ and $\hat{\alpha}_{H}$, respectively, since $|\hat{\alpha}_{tlH}-1.10|=0.002$ and $|\hat{\alpha}_{H}-1.80|=0.086$. The estimation error of $\hat{\alpha}_{IPO}$ is larger than those of the other six estimators for all $\alpha$. Clearly, the average estimation error AE (0.095) of $\hat{\alpha}^{\prime}$ is the smallest among the seven estimators. That is, the estimator $\hat{\alpha}^{\prime}$ has the best performance in estimating $\alpha\in(1,2]$ among the seven estimators.

In short, the two truncated estimators $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$ have the best overall performance in estimating $\alpha$ ($0<\alpha\leq 2$) among the seven estimators.

Remark 3.

The disadvantage of the two truncated estimators is that they need to know the range of the unknown parameter $\alpha$. If we do not know whether $\alpha$ lies in the interval $(0,1]$ or the interval $(1,2]$, we may choose the initial values $\hat{\alpha}_{0}$ and $\hat{\alpha}^{\prime}_{0}$ according to the method in Remark 2.

3.2. The rejection regions and the Type I Error

In order to compute the Type I Error, we consider the rejection regions of these estimators, except for the IPO estimator, since we do not know the asymptotic distribution of $\hat{\alpha}_{IPO}$. Let $H_{0}$ and $H_{1}$ denote the null hypothesis and the alternative hypothesis, respectively; that is,

$$\text{Null hypothesis }H_{0}:\ \alpha=\alpha_{0},\qquad\text{alternative hypothesis }H_{1}:\ \alpha\neq\alpha_{0},$$

where $0<\alpha_{0}<1$ or $1<\alpha_{0}<2$. Let the confidence level be $0.95$ and take $\beta=0$ in Theorem 3. By the asymptotic distributions in (15) and (16) of Theorem 3, we have

$$\mathbf{P}\Big(\frac{\sqrt{n}\sqrt{\alpha_{0}(2-\alpha_{0})}\,|\hat{\alpha}-\alpha_{0}|\ln b_{n}}{(1-\alpha_{0})b_{n}^{\alpha_{0}/2}}\leq 1.96\Big)\approx 2\Phi(1.96)-1=0.95$$

for $0<\alpha_{0}<1$ and

$$\mathbf{P}\Big(\frac{\sqrt{n}\sqrt{\alpha_{0}(4-\alpha_{0})}\,|\hat{\alpha}^{\prime}-\alpha_{0}|\ln b_{n}}{(2-\alpha_{0})b_{n}^{\alpha_{0}/2}}\leq 1.96\Big)\approx 2\Phi(1.96)-1=0.95$$

for $1<\alpha_{0}<2$. Therefore, we can get the two rejection regions $R_{T}$ and $R_{T}^{\prime}$:

$$R_{T}=\Big\{x:\frac{\sqrt{n}\sqrt{\alpha_{0}(2-\alpha_{0})}\,|x-\alpha_{0}|\ln b_{n}}{(1-\alpha_{0})b_{n}^{\alpha_{0}/2}}>1.96\Big\}$$

for $0<\alpha_{0}<1$ and

$$R_{T}^{\prime}=\Big\{x:\frac{\sqrt{n}\sqrt{\alpha_{0}(4-\alpha_{0})}\,|x-\alpha_{0}|\ln b_{n}}{(2-\alpha_{0})b_{n}^{\alpha_{0}/2}}>1.96\Big\}$$

for $1<\alpha_{0}<2$.
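A sketch of the resulting test for the truncated estimator (our code, directly transcribing $R_{T}$; the analogous function for $1<\alpha_{0}<2$ replaces $2-\alpha_{0}$ by $4-\alpha_{0}$ and $1-\alpha_{0}$ by $2-\alpha_{0}$):

```python
import numpy as np

def reject_H0_truncated(alpha_hat, alpha0, n, b_n):
    """Rejection region R_T for H_0: alpha = alpha0, 0 < alpha0 < 1,
    at confidence level 0.95 (critical value 1.96)."""
    stat = (np.sqrt(n) * np.sqrt(alpha0 * (2.0 - alpha0))
            * abs(alpha_hat - alpha0) * np.log(b_n)
            / ((1.0 - alpha0) * b_n ** (alpha0 / 2.0)))
    return stat > 1.96
```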

Since the five estimators $\hat{\alpha}_{H}$, $\hat{\alpha}_{Q}$, $\hat{\alpha}_{M}$, $\hat{\alpha}_{tH}$, and $\hat{\alpha}_{tlH}$ satisfy

$$\alpha_{0}\sqrt{m}\left(\hat{\alpha}_{H}^{-1}-\alpha_{0}^{-1}\right)\stackrel{d}{\longrightarrow}N(0,1),$$
$$\alpha_{0}\sqrt{m/2}\left(\hat{\alpha}_{Q}^{-1}-\alpha_{0}^{-1}\right)\stackrel{d}{\longrightarrow}N(0,1),$$
$$\frac{\alpha_{0}\sqrt{m}}{\sqrt{1+\alpha_{0}^{2}}}\left(\hat{\alpha}_{M}^{-1}-\alpha_{0}^{-1}\right)\stackrel{d}{\longrightarrow}N(0,1),$$
$$\frac{\alpha_{0}\sqrt{\alpha_{0}(\alpha_{0}+2)}\sqrt{m}}{1+\alpha_{0}}\left(\hat{\alpha}_{tH}^{-1}-\alpha_{0}^{-1}\right)\stackrel{d}{\longrightarrow}N(0,1),$$

and

$$\frac{\alpha_{0}\sqrt{m}}{2\sqrt{2}}\left(\hat{\alpha}_{tlH}^{-1}-\alpha_{0}^{-1}\right)\stackrel{d}{\longrightarrow}N(0,1),$$

we can similarly get the five rejection regions $R_{H}$, $R_{Q}$, $R_{M}$, $R_{tH}$, and $R_{tlH}$ at the confidence level $0.95$:

$$R_{H}=\left\{x:\alpha_{0}\sqrt{m}\left|x^{-1}-\alpha_{0}^{-1}\right|>1.96\right\},$$
$$R_{Q}=\left\{x:\alpha_{0}\sqrt{m/2}\left|x^{-1}-\alpha_{0}^{-1}\right|>1.96\right\},$$
$$R_{M}=\left\{x:\frac{\alpha_{0}\sqrt{m}}{\sqrt{1+\alpha_{0}^{2}}}\left|x^{-1}-\alpha_{0}^{-1}\right|>1.96\right\},$$
$$R_{tH}=\left\{x:\frac{\alpha_{0}\sqrt{\alpha_{0}(\alpha_{0}+2)}\sqrt{m}}{1+\alpha_{0}}\left|x^{-1}-\alpha_{0}^{-1}\right|>1.96\right\},$$

and

$$R_{tlH}=\left\{x:\frac{\alpha_{0}\sqrt{m}}{2\sqrt{2}}\left|x^{-1}-\alpha_{0}^{-1}\right|>1.96\right\}.$$

As in Section 3.1, we first consider $\alpha_{0}\in(0,1)$ and set the initial value $\hat{\alpha}_{0}=0.5$. Let $m=m(n)=\sqrt{n}=100$ for the four estimators $\hat{\alpha}_{H}$, $\hat{\alpha}_{Q}$, $\hat{\alpha}_{M}$, and $\hat{\alpha}_{tH}$, and set $m=m(n)=0.5n=5000$ for $\hat{\alpha}_{tlH}$.
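The empirical Type I Error of any of these tests is simply the rejection frequency under $H_{0}$. A sketch for the Hill column of Table 3 below (our code; the other columns only change the test statistic):

```python
import numpy as np

n, m, alpha0, N = 10000, 100, 0.50, 1000
rng = np.random.default_rng(3)
rejections = 0
for _ in range(N):
    x = (1.0 - rng.uniform(size=n)) ** (-1.0 / alpha0)  # simulate under H_0
    xs = np.sort(x)
    inv_hill = np.mean(np.log(xs[n - m:] / xs[n - m - 1]))
    # R_H: reject when alpha0 * sqrt(m) * |1/alpha_hat_H - 1/alpha0| > 1.96
    if alpha0 * np.sqrt(m) * abs(inv_hill - 1.0 / alpha0) > 1.96:
        rejections += 1
print(rejections / N)   # empirical Type I Error, nominally about 0.05
```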

Table 3: The Type I Error for different $\alpha_{0}\in(0,1)$ (columns $\alpha_{0}$, q, k are the simulation parameters).
$\alpha_{0}$   q   k   $\hat{\alpha}$   $\hat{\alpha}_{H}$   $\hat{\alpha}_{Q}$   $\hat{\alpha}_{M}$   $\hat{\alpha}_{tH}$   $\hat{\alpha}_{tlH}$
0.10 2.00 11 0.038 0.058 0.074 0.053 0.171 0.009
0.20 2.00 6 0.052 0.061 0.073 0.050 0.106 0.006
0.30 2.00 5 0.046 0.062 0.071 0.052 0.107 0.017
0.40 1.80 4 0.050 0.055 0.065 0.057 0.069 0.041
0.50 1.70 4 0.055 0.050 0.074 0.042 0.076 0.068
0.60 1.50 5 0.055 0.061 0.068 0.059 0.088 0.095
0.70 1.30 7 0.046 0.050 0.067 0.054 0.069 0.053
0.80 1.20 9 0.026 0.053 0.072 0.047 0.071 0.051
0.90 1.10 12 0.098 0.053 0.082 0.070 0.074 0.020
AT 0.052 0.056 0.072 0.054 0.092 0.040

Like the average estimation error AE, we can define the average Type I Error, $\mathbf{AT}=\sum(\text{Type I Error})/9$, at the confidence level $0.95$. The closer AT is to the nominal value $0.050$, the better the estimator.

Figure 3: The Type I Error for different $\alpha_{0}\in(0,1)$.

From Table 3 and Figure 3 we can see that the AT value of $\hat{\alpha}$ is closer to $0.050$ than those of the other five estimators. Thus, it can be said that the truncated estimator $\hat{\alpha}$ is better than the other five estimators for $\alpha_{0}\in(0,1)$.

Next, we consider $\alpha_{0}\in(1,2)$ and set the initial value $\hat{\alpha}^{\prime}_{0}=1.5$. As for $\alpha_{0}\in(0,1)$, let $n=10000$ be the number of samples and let the truncation sequence be $b_{n}=n^{q}$ in the truncated estimator $\hat{\alpha}^{\prime}$, where the exponent $q$ satisfies $0<\alpha_{0}q<1$.

Table 4: The Type I Error for different $\alpha_{0}\in(1,2)$ (columns $\alpha_{0}$, q, k are the simulation parameters).
$\alpha_{0}$   q   k   $\hat{\alpha}^{\prime}$   $\hat{\alpha}_{H}$   $\hat{\alpha}_{Q}$   $\hat{\alpha}_{M}$   $\hat{\alpha}_{tH}$   $\hat{\alpha}_{tlH}$
1.10 0.80 3 0.043 0.063 0.089 0.060 0.078 0.004
1.20 0.70 3 0.041 0.052 0.064 0.068 0.072 0.007
1.30 0.65 5 0.049 0.054 0.071 0.057 0.090 0.048
1.40 0.63 8 0.053 0.079 0.081 0.068 0.095 0.190
1.50 0.61 7 0.067 0.101 0.101 0.049 0.119 0.489
1.60 0.60 9 0.090 0.079 0.090 0.062 0.094 0.803
1.70 0.60 15 0.094 0.104 0.096 0.056 0.135 0.964
1.80 0.58 19 0.093 0.097 0.084 0.059 0.128 0.999
1.90 0.45 24 0.199 0.125 0.112 0.069 0.151 1.000
AT 0.081 0.084 0.088 0.061 0.107 0.500
Figure 4: The Type I Error for different $\alpha_{0}\in(1,2)$.

From Table 4 and Figure 4 we can see that the value $\mathbf{AT}=0.081$ of $\hat{\alpha}^{\prime}$ is closer to $0.050$ than those of the other four estimators except the Moment estimator $\hat{\alpha}_{M}$, whose average Type I Error is $0.061$.

3.3. Power of estimator

In this section we consider the power of the estimators, that is, the probability of correctly rejecting the null hypothesis at the confidence level $0.95$. Consider the two null hypotheses $\alpha_{0}=0.60$ and $\alpha_{0}=1.40$, respectively. Taking $b_{n}=n^{1.5}$, $n=10000$, and several different tail indices $\alpha^{*}=0.60,0.64,0.68,0.72,0.76,0.80,0.84,0.88,0.92$, we obtain the corresponding estimators $\hat{\alpha}^{*}$, $\hat{\alpha}_{H}^{*}$, $\hat{\alpha}_{Q}^{*}$, $\hat{\alpha}_{M}^{*}$, $\hat{\alpha}_{tH}^{*}$, and $\hat{\alpha}_{tlH}^{*}$. We can similarly define the average power $\mathbf{AP}=\sum\hat{\alpha}_{P}^{*}/9$, where $\hat{\alpha}_{P}^{*}$ denotes the power of $\hat{\alpha}^{*}$, $\hat{\alpha}_{H}^{*}$, $\hat{\alpha}_{Q}^{*}$, $\hat{\alpha}_{M}^{*}$, $\hat{\alpha}_{tH}^{*}$, or $\hat{\alpha}_{tlH}^{*}$.
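Empirical power is computed in the same way as the Type I Error, except that the data are simulated under an alternative $\alpha^{*}\neq\alpha_{0}$. A sketch for the Hill column of Table 5 below (ours; replacing $R_{H}$ by $R_{T}$ gives the power of the truncated estimator):

```python
import numpy as np

n, m, alpha0, alpha_star, N = 10000, 100, 0.60, 0.80, 1000
rng = np.random.default_rng(4)
rejections = 0
for _ in range(N):
    # simulate under the alternative alpha = alpha_star
    x = (1.0 - rng.uniform(size=n)) ** (-1.0 / alpha_star)
    xs = np.sort(x)
    inv_hill = np.mean(np.log(xs[n - m:] / xs[n - m - 1]))
    if alpha0 * np.sqrt(m) * abs(inv_hill - 1.0 / alpha0) > 1.96:
        rejections += 1
print(rejections / N)   # empirical power against alpha_star = 0.80
```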

Table 5: The power for $\alpha_{0}=0.60$ with different $\alpha^{*}$.
$\alpha^{*}$   k   $\hat{\alpha}^{*}$   $\hat{\alpha}_{H}^{*}$   $\hat{\alpha}_{Q}^{*}$   $\hat{\alpha}_{M}^{*}$   $\hat{\alpha}_{tH}^{*}$   $\hat{\alpha}_{tlH}^{*}$
0.60   5   0.055   0.056   0.062   0.049   0.064   0.076
0.64   5   0.137   0.067   0.029   0.086   0.049   0.777
0.68   6   0.552   0.194   0.037   0.215   0.083   0.999
0.72   7   0.841   0.330   0.108   0.342   0.147   1.000
0.76   8   0.967   0.536   0.179   0.509   0.269   1.000
0.80   5   0.990   0.760   0.295   0.657   0.440   1.000
0.84   9   0.995   0.875   0.407   0.772   0.604   1.000
0.88   7   1.000   0.945   0.558   0.893   0.709   1.000
0.92   9   1.000   0.985   0.628   0.925   0.835   1.000
AP         0.726   0.528   0.256   0.494   0.356   0.872

Table 5 above and Figure 5 below report the power and the average power AP of the six estimators $\hat{\alpha}^{*}$, $\hat{\alpha}_{H}^{*}$, $\hat{\alpha}_{Q}^{*}$, $\hat{\alpha}_{M}^{*}$, $\hat{\alpha}_{tH}^{*}$, and $\hat{\alpha}_{tlH}^{*}$. It can be seen that the average power AP of the truncated estimator $\hat{\alpha}^{*}$ is 0.726, which is larger than those of the other four estimators except the t-lgHill estimator, whose average power is 0.872.

Figure 5: The power for $\alpha_{0}=0.60$ with different $\alpha^{*}$.

Next we consider $\alpha_{0}=1.40$. It can be seen from Table 6 and Figure 6 below that the power of $\hat{\alpha}^{*\prime}$ is larger than that of the other five estimators for $\alpha^{*\prime}=1.48,1.56,1.64,1.72,1.80,1.88,1.96$, and the average power AP (0.781) of $\hat{\alpha}^{*\prime}$ is the largest among all six estimators.

Table 6: The power for $\alpha_{0}=1.40$ with different $\alpha^{*\prime}$.
$\alpha^{*\prime}$   k    $\hat{\alpha}^{*\prime}$   $\hat{\alpha}_{H}^{*\prime}$   $\hat{\alpha}_{Q}^{*\prime}$   $\hat{\alpha}_{M}^{*\prime}$   $\hat{\alpha}_{tH}^{*\prime}$   $\hat{\alpha}_{tlH}^{*\prime}$
1.40   8    0.068   0.066   0.084   0.067   0.096   0.170
1.48   8    0.383   0.038   0.040   0.088   0.044   0.006
1.56   7    0.820   0.076   0.036   0.120   0.055   0.008
1.64   8    0.978   0.164   0.058   0.181   0.107   0.125
1.72   11   0.999   0.256   0.080   0.207   0.176   0.578
1.80   12   1.000   0.414   0.142   0.331   0.264   0.925
1.88   18   1.000   0.562   0.196   0.375   0.389   0.995
1.96   26   1.000   0.678   0.254   0.423   0.518   1.000
AP          0.781   0.282   0.111   0.224   0.206   0.476
Figure 6: The power for $\alpha_{0}=1.40$ with different $\alpha^{*\prime}$.

The second-largest average power of $\hat{\alpha}^{*}$ in Table 5 and the largest average power of $\hat{\alpha}^{*\prime}$ in Table 6 indicate that the two truncated estimators perform more robustly than the other five estimators on the whole.

4. Conclusion

In order to make up for the shortcomings of existing estimation methods, we have presented a new truncated estimation method for the tail index of extremely heavy-tailed distributions with infinite mean or variance. By using the truncated sample mean $\hat{\mu}_{n}$, the truncated sample second moment $\hat{\nu}_{n}^{2}$, and the two recursive estimators in equations (9) and (10), we obtain the two truncated estimators $\hat{\alpha}$ and $\hat{\alpha}^{\prime}$ for $\alpha\in(0,1]$ and $\alpha\in(1,2]$, respectively. We not only give the rate of strong consistency convergence of the two truncated estimators, but also prove that their asymptotic distributions are normal. Moreover, the numerical simulation results show that the two truncated estimators have the smallest average estimation error among the seven estimators considered, that the truncated estimator $\hat{\alpha}$ has the average Type I Error closest to the nominal 0.05, and that the truncated estimator $\hat{\alpha}^{\prime}$ has the largest average power. In short, the performance of the two new truncated estimators is quite good on the whole.

Acknowledgments

The authors are grateful to the referees for their careful reading of this paper and valuable comments.

Declaration of interest statement

The authors declare that they have no conflict of interest.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

References

  • [1] Dekkers, A. L. M., Einmahl, J. H. J., de Haan, L. (1989). A moment estimator for the index of an extreme-value distribution. Ann. Statist. 17(4):1833-1855. doi:10.1214/aos/1176347397.
  • [2] Anderson, P. L., Meerschaert, M. M. (1998). Modeling river flows with heavy tails. Water Resources Research 34(9):2271-2280. doi:10.1029/98WR01449.
  • [3] Beirlant, J., Worms, J., Worms, R. (2018). Estimation of the extreme value index in a censorship framework: Asymptotic and finite sample behavior. Journal of Statistical Planning and Inference 202:31-56. doi:10.1016/j.jspi.2019.01.004.
  • [4] Bladt, M., Albrecher, H., Beirlant, J. (2021). Trimmed extreme value estimators for censored heavy-tailed data. Electronic Journal of Statistics 15(1):3112-3136. doi:10.1214/21-EJS1857.
  • [5] Bowers, M. C., Tung, W. W., Gao, J. B. (2012). On the distributions of seasonal river flows: Lognormal or power law? Water Resources Research 48(5). doi:10.1029/2011WR011308.
  • [6] Cooke, R., Nieboer, D., Misiewicz, J. (2014). Fat-Tailed Distributions: Data, Diagnostics and Dependence. ISTE Ltd and John Wiley & Sons, Inc. doi:10.1002/9781119054207.
  • [7] Fedotenkov, I. (2018). A review of more than one hundred Pareto-tail index estimators. Research Papers in Economics. University Library of Munich, Germany.
  • [8] Girard, S., Stupfler, G., Usseglio-Carleve, A. (2020). An $L^{p}$ quantile methodology for tail index estimation. HAL Id: hal-02311609.
  • [9] Goegebeur, Y., Guillou, A., Qin, J. (2019). Bias-corrected estimation for conditional Pareto-type distributions with random censoring. Extremes 22:459-498. doi:10.1007/s10687-019-00341-7.
  • [10] Gomes, M. I., Guillou, A. (2015). Extreme value theory and statistics of univariate extremes: A review. International Statistical Review 83:263-292. doi:10.1111/insr.12058.
  • [11] Hill, B. M. (1975). A simple general approach to inference about the tail of a distribution. Ann. Statist. 3(5):1163-1174. doi:10.1214/aos/1176343247.
  • [12] Jordanova, P., Fabian, Z., Hermann, P., Strelec, L., Rivera, A., Girard, S., Torres, S., Stehlik, M. (2016). Weak properties and robustness of t-Hill estimators. Extremes 19:591-626. doi:10.1007/s10687-016-0256-2.
  • [13] Jordanova, P. K., Pancheva, E. I. (2012). Weak asymptotic results for t-Hill estimator. Comptes rendus de l'Académie bulgare des Sciences 65(12):1649-1656.
  • [14] Jordanova, P., Stehlik, M. (2020). IPO estimation of heaviness of the distribution beyond regularly varying tails. Stochastic Analysis and Applications 38(1):76-96. doi:10.1080/07362994.2019.1647786.
  • [15] Kratz, M., Resnick, S. I. (1996). The qq-estimator and heavy tails. Communications in Statistics. Stochastic Models 12(4):699-724. doi:10.1080/15326349608807407.
  • [16] Mandelbrot, B. B. (1963). The variation of certain speculative prices. The Journal of Business 36:371-418. doi:10.1007/978-1-4757-2763-0_14.
  • [17] Ibragimov, M., Ibragimov, R., Kattuman, P. (2013). Emerging markets and heavy tails. Journal of Banking & Finance 37(7):2546-2559. doi:10.1016/j.jbankfin.2013.02.019.
  • [18] Merz, B., Basso, S., Fischer, S., Lun, D., Blöschl, G., Merz, R., et al. (2022). Understanding heavy tails of flood peak distributions. Water Resources Research 58(6). doi:10.1029/2021WR030506.
  • [19] Paulauskas, V. (2003). A new estimator for a tail index. Acta Applicandae Mathematicae 79:55-67. doi:10.1023/A:1025818424104.
  • [20] Paulauskas, V., Vaičiulis, M. (2011). Several modifications of DPR estimator of the tail index. Lithuanian Mathematical Journal 51:36-50. doi:10.1007/s10986-011-9106-8.
  • [21] Resnick, S. I. (1989). Extreme values, regular variation, and point processes. Journal of the American Statistical Association 84(407):845. doi:10.2307/2289692.
  • [22] Resnick, S. I. (2007). Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer Series in Operations Research and Financial Engineering. Springer Verlag.
  • [23] Worms, J., Worms, R. (2018). Extreme value statistics for censored data with heavy tails under competing risks. Metrika 81:849-889. doi:10.1007/s00184-018-0662-3.

Appendix: Proofs of Theorems

Proof of Theorem 1. Let $H_{1}(x)=x-G_{1}(x)$ for $0<x\leq 1$ and $H_{2}(y)=y-G_{2}(y)$ for $0<y\leq 2$. Let

$$0=H^{\prime}_{1}(x)=1-G^{\prime}_{1}(x)=1-\frac{\hat{\mu}(b_{n})}{\ln b_{n}[x(1-x)\hat{\mu}(b_{n})+x^{2}]},\qquad 0<x\leq 1,$$
$$0=H^{\prime}_{2}(y)=1-G^{\prime}_{2}(y)=1-\frac{2\hat{\nu}^{2}(b_{n})}{\ln b_{n}[y(2-y)\hat{\nu}^{2}(b_{n})+y^{2}]},\qquad 0<y\leq 2.$$

It follows that both $H^{\prime}_{1}(x)=0$ and $H^{\prime}_{2}(y)=0$ have two real roots, $x_{1,2}$ and $y_{1,2}$, respectively, i.e.

$$x_{1,2}=\frac{1\pm\sqrt{1-4(1-1/\hat{\mu}(b_{n}))/\ln b_{n}}}{2(1-1/\hat{\mu}(b_{n}))},\qquad y_{1,2}=\frac{1\pm\sqrt{1-2(1-1/\hat{\nu}^{2}(b_{n}))/\ln b_{n}}}{1-1/\hat{\nu}^{2}(b_{n})}$$

for large $n$ such that $\ln b_{n}\geq 4$, $\hat{\mu}(b_{n})>1$, and $\hat{\nu}^{2}(b_{n})>1$. Since

$$G^{\prime\prime}_{1}(x)=-\frac{\hat{\mu}(b_{n})[2x+(1-2x)\hat{\mu}(b_{n})]}{\ln b_{n}[x(1-x)\hat{\mu}(b_{n})+x^{2}]^{2}},$$

it follows that $G^{\prime\prime}_{1}(x)<0$ for $0<x\leq 1/2$, and $G^{\prime\prime}_{1}(x)>0$ for $\hat{\mu}(b_{n})>2x/(2x-1)$ and $1/2<x<1$. Hence, $H_{1}(x)=x-G_{1}(x)$ is monotonically increasing for $x_{1}<x<x_{2}$, since $G^{\prime}_{1}(x)>0$ for $0<x<1$.

Let $\mu_{n}(x_{1})=x_{1}(1-x_{1})^{-1}(b_{n}^{1-x_{1}}-1)$. Note that $x_{1}=(1+o(1))\ln^{-1}b_{n}(1+\ln^{-1}b_{n})$ for large $n$. By using $1=G^{\prime}_{1}(x_{1})$, (5), and the probability bound on $(\hat{\mu}(b_{n})-\mu_{n}(x_{1}))/\mu_{n}(x_{1})$ in (13) of Theorem 2, we can get that

$$\begin{aligned}
H_{1}(x_{1}) &= x_{1}-1+\frac{\ln[\hat{\mu}(b_{n})/\ln b_{n}]-2\ln x_{1}}{\ln b_{n}}\\
&= x_{1}-1+\frac{\ln[\mu_{n}(x_{1})/\ln b_{n}]-2\ln x_{1}+\ln[(\hat{\mu}(b_{n})-\mu_{n}(x_{1}))/\mu_{n}(x_{1})+1]}{\ln b_{n}}\\
&\leq \frac{-\ln(1-x_{1})-\ln\ln b_{n}-\ln x_{1}-2/\sqrt{\ln b_{n}}}{\ln b_{n}}\\
&= (1+o(1))\frac{2(x_{1}-\sqrt{x_{1}})}{\ln b_{n}}<0
\end{aligned}$$

with probability at least $1-n^{-2}$ for large $n$. On the other hand, $H_{1}^{\prime}(x_{2})=0$, $H_{1}(1)=1-G_{1}(1)=0$, and $H_{1}^{\prime}(1)=1-G^{\prime}_{1}(1)<0$ since $\hat{\mu}(b_{n})(\ln b_{n})^{-1}>1$ for large $n$; it follows that $H_{1}(x_{2})>0$ for large $n$. Thus, $H_{1}(x)=0$ has a unique root $\hat{\alpha}\in(x_{1},\,x_{2})$ with $H_{1}(\hat{\alpha})=0$, i.e. $\hat{\alpha}=G_{1}(\hat{\alpha})$, for large $n$. Note that $(x_{1},\,x_{2})\to(0,\,1)$ as $n\to\infty$; therefore, $H_{1}(x)=0$ has a unique root $\hat{\alpha}\in(0,\,1)$ for large $n$.

Similarly, from

$$G^{\prime\prime}_{2}(y)=-\frac{4\hat{\nu}^{2}(b_{n})[y+(1-y)\hat{\nu}^{2}(b_{n})]}{\ln b_{n}[y(2-y)\hat{\nu}^{2}(b_{n})+y^{2}]^{2}},$$

it follows that $G^{\prime\prime}_{2}(y)<0$ for $0<y\leq 1$, and $G^{\prime\prime}_{2}(y)>0$ for $\hat{\nu}^{2}(b_{n})>y/(y-1)$ and $1<y<2$. Hence, $H_{2}(y)=y-G_{2}(y)$ is monotonically increasing for $y_{1}<y<y_{2}$, since $G^{\prime}_{2}(y)>0$ for $0<y<2$.

Let $\varepsilon_{n}=\ln^{-1}b_{n}$. By equation (6) and the probability bound on $(\hat{\nu}^{2}(b_{n})-\nu^{2}_{n})/\nu^{2}_{n}$ in (14) of Theorem 2, we can get that

$$\begin{aligned}
H_{2}(1+\varepsilon_{n}) &= \nu^{-2}_{n}-1+\frac{\ln\big[\frac{1-\varepsilon_{n}}{1+\varepsilon_{n}}\hat{\nu}^{2}_{n}+1\big]}{\ln b_{n}}\\
&= \nu^{-2}_{n}-1+\frac{\ln\nu^{2}_{n}+\ln\big[\frac{1-\varepsilon_{n}}{1+\varepsilon_{n}}\big]+\ln\big[1+\frac{\hat{\nu}^{2}(b_{n})-\nu^{2}_{n}}{\nu^{2}_{n}}+\frac{1+\varepsilon_{n}}{\nu^{2}_{n}(1-\varepsilon_{n})}\big]}{\ln b_{n}}\\
&\leq \frac{-\frac{2}{\sqrt{\ln b_{n}}}+\frac{1+\varepsilon_{n}}{\nu^{2}_{n}(1-\varepsilon_{n})}}{\ln b_{n}}<0
\end{aligned}$$

with probability at least $1-n^{-2}$ for large $n$, since $\nu^{2}_{n}=(1+\varepsilon_{n})(1-\varepsilon_{n})^{-1}(b_{n}^{1-\varepsilon_{n}}-1)$. On the other hand, since $H_{2}^{\prime}(y_{2})=0$, $H_{2}(2)=0$, and $H_{2}^{\prime}(2)<0$, it follows that $H_{2}(y_{2})>0$. Thus, $H_{2}(y)=0$ has a unique root $\hat{\alpha}^{\prime}\in(1+\varepsilon_{n},\,y_{2})$ with $H_{2}(\hat{\alpha}^{\prime})=0$, i.e. $\hat{\alpha}^{\prime}=G_{2}(\hat{\alpha}^{\prime})$, for large $n$. Note that $(1+\varepsilon_{n},\,y_{2})\to(1,\,2)$ as $n\to\infty$; therefore, $H_{2}(y)=0$ has a unique root $\hat{\alpha}^{\prime}\in(1,\,2)$ for large $n$.

Note that the function $H_{1}(x)$ ($0<x<1$) is monotonically increasing for large $n$. Let $0<\hat{\alpha}_{0}<\hat{\alpha}<1$. Since $H_{1}(\hat{\alpha})=0$, it follows that $H_{1}(\hat{\alpha}_{0})<H_{1}(\hat{\alpha})=0$ and therefore $\hat{\alpha}_{0}<G_{1}(\hat{\alpha}_{0})=\hat{\alpha}_{1}$. Iterating step by step, we get $\hat{\alpha}_{k}\nearrow\hat{\alpha}$. Furthermore, by (3), (5) and (9) we have

$$\begin{aligned}
\hat{\alpha}-\hat{\alpha}_{k} &= \frac{1}{\ln b_{n}}\ln\Big(1+\frac{\hat{\mu}(b_{n})(\hat{\alpha}-\hat{\alpha}_{k-1})}{\hat{\alpha}_{k-1}(\hat{\alpha}+\hat{\mu}(b_{n}))(1-\hat{\alpha})}\Big)\\
&\leq \frac{\hat{\mu}(b_{n})(\hat{\alpha}-\hat{\alpha}_{k-1})}{\hat{\alpha}_{k-1}[\hat{\alpha}+\hat{\mu}(b_{n})(1-\hat{\alpha})]\ln b_{n}}\\
&\leq \frac{\hat{\alpha}-\hat{\alpha}_{k-1}}{\hat{\alpha}_{k-1}(1-\hat{\alpha})\ln b_{n}}\\
&\leq \cdots \leq \frac{\hat{\alpha}-\hat{\alpha}_{0}}{[\hat{\alpha}_{0}(1-\hat{\alpha})\ln b_{n}]^{k}}
\end{aligned}$$

since $\ln(1+x)\leq x$ for $x\geq 0$. For $0<\hat{\alpha}<\hat{\alpha}_{0}<1$, we can similarly get that $\hat{\alpha}_{k}\searrow\hat{\alpha}$ and

$$\hat{\alpha}_{k}-\hat{\alpha}\leq\frac{\hat{\alpha}_{0}-\hat{\alpha}}{[\hat{\alpha}(1-\hat{\alpha})\ln b_{n}]^{k}}.$$

Hence, (11) holds.

By the same method as above we can get that $\hat{\alpha}^{\prime}_{k}\nearrow\hat{\alpha}^{\prime}$ for $1<\hat{\alpha}^{\prime}_{0}<\hat{\alpha}^{\prime}<2$ and $\hat{\alpha}^{\prime}_{k}\searrow\hat{\alpha}^{\prime}$ for $1<\hat{\alpha}^{\prime}<\hat{\alpha}^{\prime}_{0}<2$, and that inequality (12) holds, since $H_{2}(y)$ ($1<y<2$) is monotonically increasing for large $n$ and

$$\begin{aligned}
\hat{\alpha}^{\prime}-\hat{\alpha}^{\prime}_{k} &= \frac{1}{\ln b_{n}}\ln\Big(1+\frac{2\hat{\nu}^{2}(b_{n})(\hat{\alpha}^{\prime}-\hat{\alpha}^{\prime}_{k-1})}{\hat{\alpha}^{\prime}_{k-1}[\hat{\alpha}^{\prime}+\hat{\nu}^{2}(b_{n})(2-\hat{\alpha}^{\prime})]}\Big),\qquad\hat{\alpha}^{\prime}_{0}<\hat{\alpha}^{\prime},\\
\hat{\alpha}^{\prime}_{k}-\hat{\alpha}^{\prime} &= \frac{1}{\ln b_{n}}\ln\Big(1+\frac{2\hat{\nu}^{2}(b_{n})(\hat{\alpha}^{\prime}_{k-1}-\hat{\alpha}^{\prime})}{\hat{\alpha}^{\prime}[\hat{\alpha}^{\prime}_{k-1}+\hat{\nu}^{2}(b_{n})(2-\hat{\alpha}^{\prime}_{k-1})]}\Big),\qquad\hat{\alpha}^{\prime}<\hat{\alpha}^{\prime}_{0}.
\end{aligned}$$

This completes the proof.

Proof of Theorem 2. Let $\varepsilon=2\mu_{n}/(n^{\beta}\sqrt{\ln b_{n}})$. Note that $\mathbf{P}(|X_{k}(b_{n})-\mu_{n}|>b_{n})=0$, $b^{\alpha}_{n}\ln b_{n}\leq\alpha n^{1-2\beta}/\ln n$, $2-\alpha\geq 2(1-\alpha)^{2}$, $\mu_{n}=\alpha(1-\alpha)^{-1}(b_{n}^{1-\alpha}-1)$, and $\mathrm{Var}(X_{1}(b_{n}))\leq\nu^{2}_{n}=\alpha(2-\alpha)^{-1}(b_{n}^{2-\alpha}-1)$. By the Bernstein inequality, we can get that

$$\begin{aligned}
\mathbf{P}\Big(\Big|\sum_{k=1}^{n}(X_{k}(b_{n})-\mu_{n})\Big|\geq n\varepsilon\Big) &\leq 2\exp\Big\{-\frac{n\varepsilon^{2}}{2\mathrm{Var}(X_{1}(b_{n}))+2b_{n}\varepsilon/3}\Big\}\\
&\leq 2\exp\Big\{-\frac{2n^{1-2\beta}\mu^{2}_{n}}{\nu^{2}_{n}\ln b_{n}+2b_{n}\mu_{n}\sqrt{\ln b_{n}}/3n^{\beta}}\Big\}\\
&\leq 2\exp\Big\{-\frac{\alpha n^{1-2\beta}(2-\alpha)b_{n}^{2-2\alpha}}{(1-\alpha)^{2}b_{n}^{2-\alpha}\ln b_{n}}\Big\}\\
&\leq 2\exp\Big\{-\frac{2\alpha n^{1-2\beta}}{b_{n}^{\alpha}\ln b_{n}}\Big\}\leq\frac{2}{n^{2}}
\end{aligned}$$

for $0<\alpha<1$ and large $n$. The inequality above also holds for $\alpha=1$, since $\mu_{n}=\ln b_{n}$ and $\nu^{2}_{n}=b_{n}-1$ in this case. Hence, the inequality in (13) holds for $0<\alpha\leq 1$. Note that

$$\frac{\sum_{k=1}^{n}\mathrm{E}(|X_{k}(b_{n})-\mu_{n}|^{3})}{[n\,\mathrm{Var}(X_{1}(b_{n}))]^{3/2}}=O\Big(\frac{b_{n}^{\alpha}}{n}\Big)^{1/2}\rightarrow 0.$$

It follows from the Lyapunov central limit theorem that

$$\frac{\sqrt{2-\alpha}\,\sqrt{n}\left(\hat{\mu}_{n}-\mu_{n}\right)}{\sqrt{\alpha}\,b_{n}^{1-\alpha/2}}=(1+o(1))\frac{n\left(\hat{\mu}_{n}-\mu_{n}\right)}{\sqrt{n\,\mathrm{Var}\left(X_{1}\left(b_{n}\right)\right)}}\Rightarrow N(0,1)$$

for $0<\alpha\leq 1$, since $\mathrm{Var}(X_{1}(b_{n}))=(1+o(1))\alpha(2-\alpha)^{-1}b_{n}^{2-\alpha}$ for large $n$; i.e., (13) holds.

By the same method we can get (14) for $1<\alpha\leq 2$, since $\mathrm{Var}(X^{2}_{1}(b_{n}))=(1+o(1))\alpha(4-\alpha)^{-1}b_{n}^{4-\alpha}$ for large $n$. This completes the proof.

Proof of Theorem 3. It follows from (3) and (5) that

$$\begin{aligned}
\alpha-\hat{\alpha} &= \frac{1}{\ln b_{n}}\ln\Big(1+(1+o(1))\frac{\alpha-\hat{\alpha}}{\hat{\alpha}(1-\alpha)}+(1+o(1))\frac{\alpha(1-\hat{\alpha})}{\hat{\alpha}(1-\alpha)}\frac{\hat{\mu}_{n}-\mu_{n}}{\mu_{n}}\Big)\\
&= (1+o(1))\frac{1}{\ln b_{n}}\frac{\hat{\mu}_{n}-\mu_{n}}{\mu_{n}}
\end{aligned}$$

for $0<\alpha<1$ and large $n$. Hence, by (13) of Theorem 2, we have

$$\mathbf{P}\Big(|\hat{\alpha}-\alpha|\geq\frac{2}{n^{\beta}\ln b_{n}\sqrt{\ln b_{n}}}\Big)=\mathbf{P}\Big(\Big|\frac{\hat{\mu}_{n}-\mu_{n}}{\mu_{n}}\Big|\geq\frac{2(1+o(1))}{n^{\beta}\sqrt{\ln b_{n}}}\Big)\leq\frac{2}{n^{2}}$$

for $0<\alpha<1$ and large $n$. Furthermore, by $(\alpha-\hat{\alpha})\mu_{n}\ln b_{n}=(1+o(1))(\hat{\mu}_{n}-\mu_{n})$ and (13), we can get that

$$\begin{aligned}
\frac{\sqrt{\alpha(2-\alpha)}}{1-\alpha}\frac{\sqrt{n}(\hat{\alpha}-\alpha)\ln b_{n}}{b_{n}^{\alpha/2}}(1+o(1)) &= \frac{\sqrt{n}(\alpha-\hat{\alpha})\mu_{n}\ln b_{n}}{\sqrt{\mathrm{Var}(X_{1}(b_{n}))}}\\
&= \frac{\sqrt{n}(1+o(1))(\hat{\mu}_{n}-\mu_{n})}{\sqrt{\mathrm{Var}(X_{1}(b_{n}))}}\Rightarrow N(0,\,1)
\end{aligned}$$

for $0<\alpha<1$. That is, (15) is true.

Similarly, by (4) and (6) we have

$$\begin{aligned}
\alpha-\hat{\alpha}^{\prime} &= \frac{1}{\ln b_{n}}\ln\Big(1+(1+o(1))\frac{2(\alpha-\hat{\alpha}^{\prime})}{\hat{\alpha}^{\prime}(2-\alpha)}+(1+o(1))\frac{\alpha(2-\hat{\alpha}^{\prime})}{\hat{\alpha}^{\prime}(2-\alpha)}\frac{\hat{\nu}^{2}_{n}-\nu^{2}_{n}}{\nu^{2}_{n}}\Big)\\
&= (1+o(1))\frac{1}{\ln b_{n}}\frac{\hat{\nu}^{2}_{n}-\nu^{2}_{n}}{\nu^{2}_{n}}
\end{aligned}$$

for $1<\alpha<2$ and large $n$. Thus, from (14) of Theorem 2 it follows that

$$\mathbf{P}\Big(|\hat{\alpha}^{\prime}-\alpha|\geq\frac{2}{n^{\beta}\ln b_{n}\sqrt{\ln b_{n}}}\Big)=\mathbf{P}\Big(\Big|\frac{\hat{\nu}^{2}_{n}-\nu^{2}_{n}}{\nu^{2}_{n}}\Big|\geq\frac{2(1+o(1))}{n^{\beta}\sqrt{\ln b_{n}}}\Big)\leq\frac{2}{n^{2}}$$

for $1<\alpha<2$. By $(\alpha-\hat{\alpha}^{\prime})\nu^{2}_{n}\ln b_{n}=(1+o(1))(\hat{\nu}^{2}_{n}-\nu^{2}_{n})$ and (14), we can similarly obtain that

$$\begin{aligned}
\frac{\sqrt{\alpha(4-\alpha)}}{2-\alpha}\frac{\sqrt{n}(\hat{\alpha}^{\prime}-\alpha)\ln b_{n}}{b_{n}^{\alpha/2}}(1+o(1)) &= \frac{\sqrt{n}(\alpha-\hat{\alpha}^{\prime})\nu^{2}_{n}\ln b_{n}}{\sqrt{\mathrm{Var}(X^{2}_{1}(b_{n}))}}\\
&= \frac{\sqrt{n}(1+o(1))(\hat{\nu}^{2}_{n}-\nu^{2}_{n})}{\sqrt{\mathrm{Var}(X^{2}_{1}(b_{n}))}}\Rightarrow N(0,\,1)
\end{aligned}$$

for $1<\alpha<2$. This proves (16).

Let $\alpha=1$ and $b_{n}\leq n^{1-2\beta}/\ln n$. Note that $\mathrm{Var}(X_{1}(b_{n}))=b_{n}-1-(\ln b_{n})^{2}$. By (7) and the Bernstein inequality, we can get that

$$\begin{aligned}
\mathbf{P}\Big(|\hat{\alpha}-1|\geq\frac{2\sqrt{2}}{n^{\beta}\ln b_{n}}\Big) &= \mathbf{P}\Big(|n(\hat{\mu}_{n}-\mu_{n})|\geq 2\sqrt{2}\,n\Big)\\
&\leq 2\exp\Big\{-\frac{4n}{\mathrm{Var}(X_{1}(b_{n}))+2b_{n}\sqrt{2}/3}\Big\}\leq\frac{2}{n^{2}}
\end{aligned}$$

for large nn. It follows from the Lyapunov central limit theorem that

$$\frac{\sqrt{n}(\hat{\alpha}-1)\ln b_{n}}{\sqrt{b_{n}}}=(1+o(1))\frac{\sqrt{n}(\hat{\mu}_{n}-\ln b_{n})}{\sqrt{\mathrm{Var}(X_{1}(b_{n}))}}\Rightarrow N(0,\,1).$$

This proves (17).

Let $\alpha=2$ and $b^{2}_{n}\leq n^{1-\beta}/\ln n$. By (8), the Bernstein inequality, and the Lyapunov central limit theorem, we can similarly prove (18), since $\mathrm{Var}(X^{2}_{1}(b_{n}))=(1+o(1))b_{n}^{2}$ for large $n$. This completes the proof.