
Inequality for the variance of an asymmetric loss

Naoya Yamaguchi, Yuka Yamaguchi and Maiya Hori
Abstract

We assume that the forecast error follows a probability distribution whose density is symmetric about zero and monotonically non-increasing on the non-negative real numbers, and that any mismatch between the observed and predicted values incurs a loss. Under these assumptions, we solve a minimization problem with an asymmetric loss function. In addition, we give an inequality for the variance of the loss.

1 Introduction

Let $\hat{y}$ be a predicted value of an observed value $y$. In this paper, we make the following assumptions (I) and (II):

  (I) The prediction error $z := \hat{y} - y$ is the realized value of a random variable $Z$, whose probability density function $f(z)$ satisfies $f(x) = f(-x)$ for $x \in \mathbb{R}$ and $f(x) \geq f(y)$ for $0 \leq x \leq y$.

  (II) Let $k_{1}, k_{2} \in \mathbb{R}_{>0}$. If there is a mismatch between $y$ and $\hat{y}$, then we suffer a loss

    \[ L(z) := \begin{cases} k_{1} z, & z \geq 0, \\ -k_{2} z, & z < 0. \end{cases} \]
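The loss in assumption (II) can be written as a small function. The following is a minimal sketch; the name `asymmetric_loss` and the example values `k1 = 1`, `k2 = 3` are illustrative assumptions, not taken from the paper.

```python
def asymmetric_loss(z: float, k1: float, k2: float) -> float:
    """Loss L(z) from assumption (II): k1*z for z >= 0, -k2*z for z < 0."""
    if k1 <= 0 or k2 <= 0:
        raise ValueError("k1 and k2 must be positive")
    return k1 * z if z >= 0 else -k2 * z


# Hypothetical coefficients: over-prediction costs k1 per unit,
# under-prediction costs k2 per unit.
over = asymmetric_loss(2.0, k1=1.0, k2=3.0)
under = asymmetric_loss(-2.0, k1=1.0, k2=3.0)
print(over, under)
```

With these coefficients, an under-prediction of a given size is penalized three times as heavily as an over-prediction of the same size.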

Under assumptions (I) and (II), we solve the minimization problem for the expected value of $L(Z + c)$:

\[ C = \arg\min_{c} \{ \operatorname{E}[L(Z + c)] \}. \]

In addition, we give the following theorem.

Theorem 1.

We have

\[ \operatorname{V}[L(Z + C)] \leq \operatorname{V}[L(Z)], \]

where equality holds only when $C = 0$; that is, when $k_{1} = k_{2}$.
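Theorem 1 can be checked numerically for one concrete density. The sketch below assumes that $Z$ is standard normal (one distribution satisfying (I), chosen here only for illustration), approximates the moments by a midpoint rule, and locates $C$ by ternary search, which relies on the convexity of $\operatorname{E}[L(Z+c)]$ in $c$. All function names and the coefficients `k1 = 1`, `k2 = 3` are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: Z ~ N(0, 1), not prescribed by the paper


def loss(z, k1, k2):
    """Asymmetric loss L(z) from assumption (II)."""
    return k1 * z if z >= 0 else -k2 * z


def moment(c, k1, k2, power, lo=-10.0, hi=10.0, n=4001):
    """Midpoint-rule approximation of E[L(Z + c)**power]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * h
        total += loss(z + c, k1, k2) ** power * STD_NORMAL.pdf(z)
    return total * h


def variance(c, k1, k2):
    """Approximation of V[L(Z + c)]."""
    return moment(c, k1, k2, 2) - moment(c, k1, k2, 1) ** 2


def argmin_expected_loss(k1, k2, lo=-5.0, hi=5.0, iters=40):
    """Ternary search for the minimizer of the convex map c -> E[L(Z + c)]."""
    for _ in range(iters):
        left = lo + (hi - lo) / 3
        right = hi - (hi - lo) / 3
        if moment(left, k1, k2, 1) < moment(right, k1, k2, 1):
            hi = right
        else:
            lo = left
    return (lo + hi) / 2


k1, k2 = 1.0, 3.0
c_star = argmin_expected_loss(k1, k2)
v_shifted = variance(c_star, k1, k2)
v_unshifted = variance(0.0, k1, k2)
print(c_star, v_shifted, v_unshifted)
```

For these inputs the shifted variance should come out smaller than the unshifted one, in line with the theorem; the check covers a single density and is not a proof.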

Theorem 1 follows from the following lemma.

Lemma 2.

Suppose that a probability density function $f(t)$ is monotonically non-increasing on $\mathbb{R}_{\geq 0}$ and satisfies $\int_{0}^{\infty} f(t) \, dt = \frac{1}{2}$. Then, for any $x \geq 0$, we have

\[ \alpha(x) := 4 \int_{0}^{x} f(t) \, dt \int_{x}^{\infty} t f(t) \, dt - \frac{x}{2} + 2x \left( \int_{0}^{x} f(t) \, dt \right)^{2} \geq 0. \]

If $f(t)$ is strictly decreasing, then $\alpha(x) > 0$ holds for $x > 0$. Also, $\alpha(x) = 0$ holds for every $x \geq 0$ if and only if $f(t)$ equals the probability density function of a continuous uniform distribution on $\mathbb{R}_{\geq 0}$.
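The two cases of Lemma 2 can be illustrated numerically. The sketch below evaluates $\alpha(x)$ for a standard normal density (strictly decreasing on $\mathbb{R}_{\geq 0}$) and for a uniform density on $[-a, a]$; both choices, the helper names, and the value `a = 2` are assumptions made for this example. It uses the closed forms $\int_0^x f = \Phi(x) - \frac12$ and $\int_x^\infty t f(t)\,dt = \varphi(x)$ for the normal case, and $\int_0^x f = \min(x,a)/(2a)$ and $\int_x^\infty t f(t)\,dt = \max(a^2 - x^2, 0)/(4a)$ for the uniform case.

```python
from statistics import NormalDist

nd = NormalDist()  # assumption: standard normal as the strictly decreasing example


def alpha(x, mass, tail_moment):
    """alpha(x) from Lemma 2.

    mass(x)        = int_0^x f(t) dt
    tail_moment(x) = int_x^inf t f(t) dt
    """
    F = mass(x)
    M = tail_moment(x)
    return 4 * F * M - x / 2 + 2 * x * F ** 2


def alpha_normal(x):
    # Standard normal: mass is Phi(x) - 1/2, tail moment is the density phi(x).
    return alpha(x, lambda t: nd.cdf(t) - 0.5, nd.pdf)


def alpha_uniform(x, a=2.0):
    # Uniform on [-a, a]: f = 1/(2a) on [0, a] and 0 beyond a.
    return alpha(
        x,
        lambda t: min(t, a) / (2 * a),
        lambda t: max(a * a - t * t, 0.0) / (4 * a),
    )


xs = [0.1, 0.5, 1.0, 2.0, 3.0]
normal_values = [alpha_normal(x) for x in xs]
uniform_values = [alpha_uniform(x) for x in xs]
print(normal_values)
print(uniform_values)
```

The normal values should all be strictly positive, while the uniform values should vanish up to rounding, matching the equality case of the lemma.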

These results generalize those of [5], which made assumptions (I’) and (II):

  (I’) The prediction error $z := \hat{y} - y$ is the realized value of a random variable $Z$, whose probability density function is that of a generalized Gaussian distribution (see, e.g., [1], [2], and [3]) with mean zero,

    \[ f(z) := \frac{1}{2ab\Gamma(a)} \exp\left( -\left\lvert \frac{z}{b} \right\rvert^{\frac{1}{a}} \right), \]

    where $\Gamma(a)$ is the gamma function and $a, b > 0$.

Assumption (I) is weaker than (I’), so we treat a more general situation than [5]. In [5], under assumptions (I’) and (II), the minimization problem for the expected value of $L(Z + c)$ is solved and the inequality $\operatorname{V}[L(Z + C)] \leq \operatorname{V}[L(Z)]$ is obtained. This inequality is derived from the following inequality: for $a, x > 0$, we have

\[ x^{a} \gamma(a, x)^{2} - x^{a} \Gamma(a)^{2} + 2 \gamma(a, x) \Gamma(2a, x) > 0, \tag{1} \]

where

\[ \Gamma(a) := \int_{0}^{+\infty} t^{a-1} e^{-t} \, dt, \quad \Gamma(a, x) := \int_{x}^{+\infty} t^{a-1} e^{-t} \, dt, \quad \gamma(a, x) := \int_{0}^{x} t^{a-1} e^{-t} \, dt. \]

Inequality (1) is the special case of Lemma 2 in which $f(z)$ is the probability density function of a generalized Gaussian distribution.
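This reduction can be made explicit by a change of variables. The following is a sketch; the substitution $s = (t/b)^{1/a}$ and the abbreviation $X := (x/b)^{1/a}$ are introduced here for illustration and are not notation from the paper.

```latex
% Sketch: Lemma 2 applied to the generalized Gaussian density f of (I').
% Assumed notation: s = (t/b)^{1/a}, so t = b s^a and dt = a b s^{a-1} ds; X = (x/b)^{1/a}.
\begin{align*}
\int_{0}^{x} f(t)\,dt
  &= \frac{1}{2ab\Gamma(a)} \int_{0}^{X} e^{-s}\, ab\, s^{a-1}\,ds
   = \frac{\gamma(a,X)}{2\Gamma(a)}, \\
\int_{x}^{\infty} t f(t)\,dt
  &= \frac{1}{2ab\Gamma(a)} \int_{X}^{\infty} b s^{a} e^{-s}\, ab\, s^{a-1}\,ds
   = \frac{b\,\Gamma(2a,X)}{2\Gamma(a)}, \\
\alpha(x)
  &= \frac{b\,\gamma(a,X)\,\Gamma(2a,X)}{\Gamma(a)^{2}} - \frac{bX^{a}}{2}
     + \frac{bX^{a}\gamma(a,X)^{2}}{2\Gamma(a)^{2}} \\
  &= \frac{b}{2\Gamma(a)^{2}}
     \left( X^{a}\gamma(a,X)^{2} - X^{a}\Gamma(a)^{2} + 2\gamma(a,X)\Gamma(2a,X) \right).
\end{align*}
```

Since this density is strictly decreasing on $\mathbb{R}_{\geq 0}$, Lemma 2 gives $\alpha(x) > 0$ for $x > 0$, which is inequality (1) with $X$ in place of $x$.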

Assumptions (I) and (II) are motivated by procurement from an electricity market. Suppose that we purchase an amount $\hat{y}$ of electricity from a market, based on a forecast of the amount $y$ that will be needed. This situation motivates assumption (I). If $\hat{y} - y > 0$, then a procurement fee proportional to $\hat{y} - y$ is wasted. If $y - \hat{y} > 0$, then we are charged a penalty proportional to $y - \hat{y}$. This situation motivates assumption (II). For details, see [4].

2 Proof of results

For $c \in \mathbb{R}$, let $\operatorname{sgn}(c) := 1$ if $c \geq 0$ and $\operatorname{sgn}(c) := -1$ if $c < 0$. Since $\int_{0}^{\infty} f(z) \, dz = \frac{1}{2}$, the expected values of $L(Z + c)$ and $L(Z + c)^{2}$ are as follows: for any $c \in \mathbb{R}$,

\begin{align*}
\operatorname{E}[L(Z+c)] &= (k_{1}+k_{2}) \int_{|c|}^{\infty} z f(z) \, dz + \frac{c(k_{1}-k_{2})}{2} + |c|(k_{1}+k_{2}) \int_{0}^{|c|} f(z) \, dz, \\
\operatorname{E}[L(Z+c)^{2}] &= (k_{1}^{2}+k_{2}^{2}) \int_{0}^{\infty} z^{2} f(z) \, dz + \operatorname{sgn}(c)(k_{1}^{2}-k_{2}^{2}) \int_{0}^{|c|} z^{2} f(z) \, dz \\
&\qquad + 2c(k_{1}^{2}-k_{2}^{2}) \int_{|c|}^{\infty} z f(z) \, dz + \frac{c^{2}(k_{1}^{2}+k_{2}^{2})}{2} + c|c|(k_{1}^{2}-k_{2}^{2}) \int_{0}^{|c|} f(z) \, dz.
\end{align*}
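The formula for $\operatorname{E}[L(Z+c)]$ can be compared against direct numerical integration. The sketch below assumes a standard normal $Z$ (an illustrative choice, for which $\int_{|c|}^{\infty} z f(z)\,dz = \varphi(|c|)$ and $\int_{0}^{|c|} f(z)\,dz = \Phi(|c|) - \frac12$); the function names and test values are hypothetical.

```python
from statistics import NormalDist

nd = NormalDist()  # assumption: Z ~ N(0, 1)


def expected_loss_formula(c, k1, k2):
    """E[L(Z + c)] via the closed formula, specialized to the standard normal."""
    a = abs(c)
    tail = nd.pdf(a)        # int_{|c|}^inf z f(z) dz
    mass = nd.cdf(a) - 0.5  # int_0^{|c|} f(z) dz
    return (k1 + k2) * tail + c * (k1 - k2) / 2 + a * (k1 + k2) * mass


def expected_loss_quadrature(c, k1, k2, lo=-10.0, hi=10.0, n=20000):
    """E[L(Z + c)] via a midpoint rule applied to L(z + c) f(z)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * h
        s = z + c
        total += (k1 * s if s >= 0 else -k2 * s) * nd.pdf(z)
    return total * h


gaps = [
    abs(expected_loss_formula(c, 1.0, 3.0) - expected_loss_quadrature(c, 1.0, 3.0))
    for c in (-1.5, 0.0, 0.7)
]
print(gaps)
```

The gaps should be at the level of the quadrature error, for both signs of $c$.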

Therefore, the expected value and the variance of $L(Z)$ are as follows:

\begin{align*}
\operatorname{E}[L(Z)] &= (k_{1}+k_{2}) \int_{0}^{\infty} z f(z) \, dz, \\
\operatorname{V}[L(Z)] &= (k_{1}^{2}+k_{2}^{2}) \int_{0}^{\infty} z^{2} f(z) \, dz - (k_{1}+k_{2})^{2} \left( \int_{0}^{\infty} z f(z) \, dz \right)^{2}.
\end{align*}
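As a worked instance of these two formulas, take $Z$ to be standard normal. This choice is an assumption made only for illustration; the paper does not fix a particular density.

```latex
% Illustration only: f(z) = e^{-z^2/2} / \sqrt{2\pi} (standard normal, assumed here).
\begin{align*}
\int_{0}^{\infty} z f(z)\,dz &= \frac{1}{\sqrt{2\pi}}, \qquad
\int_{0}^{\infty} z^{2} f(z)\,dz = \frac{1}{2}, \\
\operatorname{E}[L(Z)] &= \frac{k_{1}+k_{2}}{\sqrt{2\pi}}, \\
\operatorname{V}[L(Z)] &= \frac{k_{1}^{2}+k_{2}^{2}}{2} - \frac{(k_{1}+k_{2})^{2}}{2\pi}.
\end{align*}
```

For $k_{1} = k_{2} = 1$ the loss is $|Z|$, and the expression reduces to the familiar value $1 - \frac{2}{\pi}$ for the variance of a half-normal variable.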

We determine the value of $c$ that minimizes $\operatorname{E}[L(Z+c)]$. From

\begin{align*}
\frac{d}{dc} \operatorname{E}[L(Z+c)] &= \frac{k_{1}-k_{2}}{2} + \operatorname{sgn}(c)(k_{1}+k_{2}) \int_{0}^{|c|} f(z) \, dz, \\
\frac{d^{2}}{dc^{2}} \operatorname{E}[L(Z+c)] &= (k_{1}+k_{2}) f(c) \geq 0,
\end{align*}

we see that $\operatorname{E}[L(Z+c)]$ is convex in $c$ and attains its minimum at a zero of $\frac{d}{dc} \operatorname{E}[L(Z+c)]$. Such a zero $C$ satisfies the following equation:

\[ \frac{k_{1}-k_{2}}{2} + \operatorname{sgn}(C)(k_{1}+k_{2}) \int_{0}^{|C|} f(z) \, dz = 0. \]

From this, $C = 0$ if and only if $k_{1} = k_{2}$. Also, we have

\begin{align*}
\operatorname{E}[L(Z+C)] &= (k_{1}+k_{2}) \int_{|C|}^{\infty} z f(z) \, dz, \\
\operatorname{V}[L(Z+C)] &= (k_{1}^{2}+k_{2}^{2}) \int_{0}^{\infty} z^{2} f(z) \, dz - 2(k_{1}+k_{2})^{2} \int_{0}^{|C|} f(z) \, dz \int_{0}^{|C|} z^{2} f(z) \, dz \\
&\qquad - 4|C|(k_{1}+k_{2})^{2} \int_{0}^{|C|} f(z) \, dz \int_{|C|}^{\infty} z f(z) \, dz + \frac{C^{2}(k_{1}+k_{2})^{2}}{4} \\
&\qquad - (k_{1}+k_{2})^{2} \left( \int_{|C|}^{\infty} z f(z) \, dz \right)^{2} - C^{2}(k_{1}+k_{2})^{2} \left( \int_{0}^{|C|} f(z) \, dz \right)^{2}.
\end{align*}
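For a specific density, the equation for $C$ can be solved in closed form. The sketch below again assumes a standard normal $Z$, for which the equation reduces to $\Phi(C) = k_{2}/(k_{1}+k_{2})$; this reduction, the function names, and the sample coefficients are worked out for this example and are not stated in the paper.

```python
from statistics import NormalDist

nd = NormalDist()  # assumption: Z ~ N(0, 1)


def optimal_shift(k1, k2):
    """C solving the first-order condition, i.e. Phi(C) = k2 / (k1 + k2)."""
    return nd.inv_cdf(k2 / (k1 + k2))


def first_order_residual(c, k1, k2):
    """Left-hand side of the equation that C must satisfy."""
    sgn = 1.0 if c >= 0 else -1.0
    return (k1 - k2) / 2 + sgn * (k1 + k2) * (nd.cdf(abs(c)) - 0.5)


def minimal_expected_loss(k1, k2):
    """E[L(Z + C)] = (k1 + k2) * int_{|C|}^inf z f(z) dz = (k1 + k2) * phi(C)."""
    return (k1 + k2) * nd.pdf(optimal_shift(k1, k2))


for k1, k2 in [(1.0, 3.0), (3.0, 1.0), (2.0, 2.0)]:
    c = optimal_shift(k1, k2)
    print(k1, k2, c, first_order_residual(c, k1, k2), minimal_expected_loss(k1, k2))
```

The shift is positive when under-prediction is penalized more heavily ($k_{2} > k_{1}$), negative in the opposite case, and zero when $k_{1} = k_{2}$.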

Let

\begin{align*}
\beta(x) &:= -\left( \int_{0}^{\infty} z f(z) \, dz \right)^{2} + 2 \int_{0}^{x} f(z) \, dz \int_{0}^{x} z^{2} f(z) \, dz + 4x \int_{0}^{x} f(z) \, dz \int_{x}^{\infty} z f(z) \, dz \\
&\qquad - \frac{x^{2}}{4} + \left( \int_{x}^{\infty} z f(z) \, dz \right)^{2} + x^{2} \left( \int_{0}^{x} f(z) \, dz \right)^{2}.
\end{align*}

Then, $\operatorname{V}[L(Z)] - \operatorname{V}[L(Z+C)] = (k_{1}+k_{2})^{2} \beta(C)$ holds. We have $\beta(0) = 0$ and

\begin{align*}
\frac{d}{dx} \beta(x) &= 4 \int_{0}^{x} f(z) \, dz \int_{x}^{\infty} z f(z) \, dz - \frac{x}{2} + 2x \left( \int_{0}^{x} f(z) \, dz \right)^{2} \\
&\qquad + 2 f(x) \int_{0}^{x} z^{2} f(z) \, dz + 2x f(x) \int_{x}^{\infty} z f(z) \, dz.
\end{align*}

The first line is $\alpha(x)$, and the two terms on the second line are non-negative for $x \geq 0$, so $\frac{d}{dx} \beta(x) \geq \alpha(x)$. Hence, if Lemma 2 is proved, then Theorem 1 follows immediately. We now prove Lemma 2.

Proof of Lemma 2.

Take any $x \geq 0$. If $f(x) = 0$, then $\alpha(x) = 0 - \frac{x}{2} + 2x \cdot \frac{1}{4} = 0$. Below, we consider the case $f(x) > 0$. Let $\gamma := \int_{0}^{x} f(t) \, dt$. For a function $g = g(t)$ satisfying $f(x) \geq g(t) \geq 0$ for $x \leq t$ and $\gamma + \int_{x}^{\infty} g(t) \, dt = \frac{1}{2}$, we define a functional $S(g)$ by

\[ S(g) := \int_{x}^{\infty} t g(t) \, dt. \]

Regarding $S(g)$ as the volume of a solid whose base has the constant area $\int_{x}^{\infty} g(t) \, dt = \frac{1}{2} - \gamma$, we find that $S(g)$ becomes smaller if we make $g(t)$ as large as possible where $t$ is small. Thus, the function $g$ that minimizes $S(g)$ is $g(t) = u(t)$ defined by

\[ u(t) := \begin{cases} f(x), & x \leq t \leq x + \frac{1}{f(x)} \left( \frac{1}{2} - \gamma \right), \\ 0, & \text{otherwise}. \end{cases} \]

From

\[ S(u) = \int_{x}^{\infty} t u(t) \, dt = x \left( \frac{1}{2} - \gamma \right) + \frac{1}{2f(x)} \left( \gamma^{2} - \gamma + \frac{1}{4} \right) \]

and $\gamma \geq x f(x)$, we have

\begin{align*}
\alpha(x) &\geq 4 \gamma S(u) - \frac{x}{2} + 2x \gamma^{2} \\
&= 4 \gamma \left\{ x \left( \frac{1}{2} - \gamma \right) + \frac{1}{2f(x)} \left( \gamma^{2} - \gamma + \frac{1}{4} \right) \right\} - \frac{x}{2} + 2x \gamma^{2} \\
&\geq 2x\gamma - 4x\gamma^{2} + 2x \left( \gamma^{2} - \gamma + \frac{1}{4} \right) - \frac{x}{2} + 2x \gamma^{2} \\
&= 0.
\end{align*}

Also, from this, if $f(t)$ is strictly decreasing, then $\alpha(x) > 0$ holds for $x > 0$. In addition, $f(t)$ is a function of the form

\[ f(t) = \begin{cases} \frac{1}{2a}, & 0 \leq t \leq a, \\ 0, & t > a \end{cases} \]

for some $a > 0$ if and only if $\alpha(x) = 0$ holds for every $x \geq 0$. ∎
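Two steps used in the proof of Lemma 2 can be spelled out. The following is a sketch; the symbol $w$ for the width of the support of $u$ is introduced here for convenience and does not appear in the paper.

```latex
% Sketch of two intermediate steps; w is an assumed abbreviation.
\begin{align*}
\gamma &= \int_{0}^{x} f(t)\,dt \;\geq\; \int_{0}^{x} f(x)\,dt = x f(x)
  && \text{(since $f$ is non-increasing on $[0,x]$)}, \\
w &:= \frac{1}{f(x)}\left(\frac{1}{2}-\gamma\right), \\
S(u) &= f(x)\int_{x}^{x+w} t\,dt = f(x)\left(xw + \frac{w^{2}}{2}\right)
      = x\left(\frac{1}{2}-\gamma\right) + \frac{1}{2f(x)}\left(\frac{1}{2}-\gamma\right)^{2}, \\
\frac{2\gamma}{f(x)}\left(\gamma-\frac{1}{2}\right)^{2} &\geq 2x\left(\gamma-\frac{1}{2}\right)^{2}
  && \text{(using $\gamma \geq x f(x)$)}.
\end{align*}
```

The first line justifies $\gamma \geq x f(x)$, the third reproduces the stated value of $S(u)$ because $\left(\frac{1}{2}-\gamma\right)^{2} = \gamma^{2}-\gamma+\frac{1}{4}$, and the last line is the estimate behind the second inequality in the chain for $\alpha(x)$.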

References

  • [1] Alex Dytso, Ronit Bustin, H. Vincent Poor, and Shlomo Shamai. Analytical properties of generalized Gaussian distributions. Journal of Statistical Distributions and Applications, 5(1):6, Dec 2018.
  • [2] Saralees Nadarajah. A generalized normal distribution. Journal of Applied Statistics, 32(7):685–694, 2005.
  • [3] Th. Subbotin. On the law of frequency of error. Recueil Mathématique, 31:296–301, 1923.
  • [4] Naoya Yamaguchi, Maiya Hori, and Yoshinari Ideguchi. Minimising the expectation value of the procurement cost in electricity markets based on the prediction error of energy consumption. Pac. J. Math. Ind., 10:Art. 4, 16, 2018.
  • [5] Naoya Yamaguchi, Yuka Yamaguchi, and Ryuei Nishii. Minimizing the expected value of the asymmetric loss function and an inequality for the variance of the loss. Journal of Applied Statistics, 48(13-15):2348–2368, 2021. PMID: 35707067.

Faculty of Education, University of Miyazaki, 1-1 Gakuen Kibanadai-nishi, Miyazaki 889-2192, Japan

Email address, Naoya Yamaguchi: [email protected]

Email address, Yuka Yamaguchi: [email protected]

General Education Center, Tottori University of Environmental Studies, 1-1-1 Wakabadai-kita, Tottori 689-1111, Japan

Email address, Maiya Hori: [email protected]