Hidden Equations of Threshold Risk

Vladimir V. Ejov Jerzy A. Filar College of Science and Engineering, Flinders University, South Australia, Australia Centre for Applications in Natural Resource Mathematics, School of Mathematics and Physics, The University of Queensland, Queensland, Australia Zhihao Qiao

Abstract

We consider the problem of sensitivity of threshold risk, defined as the probability of a function of a random variable falling below a specified threshold level $\delta>0.$ We demonstrate that for polynomial and rational functions of that random variable there exist at most finitely many risk critical points. The latter are those special values of the threshold parameter for which the rate of change of risk is unbounded as $\delta$ approaches these threshold values. We characterize candidates for risk critical points as zeroes of either the resultant of a relevant $\delta$-perturbed polynomial, or of its leading coefficient, or both. Thus the equations that need to be solved are themselves polynomial equations in $\delta$ that exploit the algebraic properties of the underlying polynomial or rational functions. We call these equations the “hidden equations of threshold risk”.

The authors gratefully acknowledge the ARC Discovery grant DP180101602 and many valuable discussions with members of the research team Y. Nazarathy, T. Taimre, H. Jansen and S. Streipert of that project.

Key Words: Threshold Risk, Tail Probabilities, Polynomial Perturbations, Roots of Polynomials, Discriminant and Puiseux Series.

1 Introduction and Motivation

This paper is motivated by the dual notions of “tipping points” and “risk sensitivity” frequently arising in society’s interactions with the natural environment. On some level the problem is at least as old as the history of agriculture with farmers being concerned about rainfall falling below some acceptable level, or onset of frost; a prototypical tipping point for successful cultivation of certain crops.

More recently, concerns about global warming induced by adverse climate change have focused on the level of such warming exceeding thresholds such as 1.5 or 2.0 degrees C by the year 2030 or 2050. Similarly, in the area of sustainable management of fisheries, regulators often consider a fishery secure if the biomass of the harvested species does not fall below a certain percentage (e.g., $60\%$) of the virgin biomass. From the perspective of mathematical modelling of these concerns we first recognise two essential features:

1. often the variable that we are most interested in (e.g. harvest yield, or fish stock) depends essentially on at least one random variable;

2. the tipping point is, perhaps, most naturally represented as a “special value” of some parameter. In particular, a value such that if the variable of interest falls below (or above) that value, this is considered to be a “high risk” situation.

The use of quotation marks in that last point suggests that there is a need to make these phrases precise so as to be able to analyse them rigorously. In particular, there are already several alternative mathematical formulations of risk often stemming from actuarial science, finance or engineering. However, in this paper, we take the position that the simple threshold risk is both appropriate and already challenging in the context of the management of natural resources such as fisheries. Conceptually, this risk is modelled as a tail probability

P(h(\text{random variable})<\delta),

where $\delta$ is the threshold parameter, and $h(\cdot)$ is a given function.

At first sight, this formulation of risk may appear to correspond to a problem fully solved by mathematical statisticians and probabilists. In particular, it is a problem extensively studied in the context of extreme value theory (e.g., see [3]), financial mathematics (e.g., see [6], [7]) and large deviation theory (e.g., see [8]). These approaches focus primarily on asymptotic properties of tail probabilities of certain classes of distributions. However, our approach is essentially different in the sense that we explore the parametric sensitivity of the threshold risk induced by the algebraic form of the function of the random variable that is of interest.

Indeed, recent applied studies such as [2] and [4], indicate that the threshold risk may exhibit high sensitivity to the choice of model parameters, including the threshold parameter. The latter arose in two quite disparate contexts of hospital management and fishery population models. This leads us to a more formal definition and analysis of threshold risk and critical values of the threshold which are natural candidates for tipping points. This is taken up in the next section.

2 Risk sensitivity of threshold probability with polynomials of random variables

The most general one-dimensional problem we will consider here is one where the random variable of main interest is a known rational function $h(X)$ of another random variable $X$ whose cumulative distribution function (cdf) $F(x)$ is also assumed to be known. In this paper we assume that $X$ is an absolutely continuous random variable, hence the density function $f(x)$ is well-defined (see also Remark 2 in Section 3). We begin the analysis with a simpler case where $h(X)=p(X),$ a known polynomial in $X.$

Definition 1 (Risk with one polynomial function).

Let $X$ be a random variable and $\delta\in\mathbb{R}$ and consider a polynomial $p(X)=p_{0}+p_{1}X+p_{2}X^{2}+\cdots+p_{n}X^{n}.$ The threshold risk probability is

R(\delta)=P\Big{(}p(X)<\delta\Big{)},

(1)

where $\delta$ is a real valued parameter denoting the threshold.

Of course, in some applications the inequality in (1) would be reversed. More generally, the threshold could be a multiple of another polynomial function $q(X)$ . In such a case, the threshold risk definition is extended as follows.

Definition 2 (Risk with two polynomial functions).

With the same quantities as in Definition 1 and $q(X)=q_{0}+q_{1}X+q_{2}X^{2}+\cdots+q_{m}X^{m}$

R(\delta)=P\Big{(}p(X)<\delta q(X)\Big{)}=P\Big{(}p(X)-\delta q(X)<0\Big{)}.

(2)

Note that in both (1) and (2), we could have defined a $\delta$ -perturbed polynomial $p_{\delta}(X)=p(X)-\delta q(X)$ and expressed the threshold risk as

R(\delta)=P\Big{(}p_{\delta}(X)<0\Big{)}.

Naturally, in the case of (1), $q(X)$ is identically equal to 1.

A change in risk as the threshold changes from $\delta_{0}$ to $\delta$ will be measured by the ratio

S(\delta_{0},\delta)=\frac{|R(\delta_{0})-R(\delta)|}{|\delta_{0}-\delta|}\approx|R^{\prime}(\delta_{0})|,

(3)

if the derivative exists and $\delta$ is close to $\delta_{0}.$

Definition 3.

Threshold risk sensitivity is now defined as follows.

1. The risk sensitivity at $\delta_{0}$ is defined as the absolute value of the derivative $R^{\prime}(\delta_{0})$, if it exists.

2. If $|R^{\prime}(\delta_{0})|$ is infinite or undefined, then $\delta_{0}$ is a candidate risk critical point.

3. We say $\delta_{0}$ is a risk critical point if there does not exist a neighbourhood $\mathcal{N}$ of $\delta_{0}$ such that $S(\delta_{0},\delta)$ is uniformly bounded for all $\delta\in\mathcal{N}.$

This is related to (but not the same as) the hazard function used in demography and actuarial science. The latter considers the ratio of the probability density function at $\delta_{0}$ to the probability of exceeding that threshold.
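The ratio in (3) is straightforward to evaluate numerically once $R(\delta)$ is available. The following minimal sketch assumes, purely for illustration, the trivial polynomial $p(X)=X$ with $X\sim\mathcal{N}(0,1)$, so that $R(\delta)$ is the standard normal cdf; the function names `normal_cdf` and `sensitivity_ratio` are ours and do not come from the paper.

```python
import math


def normal_cdf(x):
    """Standard normal cdf; here it plays the role of R(delta) for p(X) = X."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sensitivity_ratio(risk, delta0, delta):
    """S(delta0, delta) = |R(delta0) - R(delta)| / |delta0 - delta|, as in (3)."""
    if delta == delta0:
        raise ValueError("delta must differ from delta0")
    return abs(risk(delta0) - risk(delta)) / abs(delta0 - delta)


# Illustrative assumption: p(X) = X and X ~ N(0, 1), hence R = normal_cdf.
ratio_near_zero = sensitivity_ratio(normal_cdf, 0.0, 1e-6)
density_at_zero = 1.0 / math.sqrt(2.0 * math.pi)
print(ratio_near_zero, density_at_zero)
```

In this smooth case the ratio stays close to $|R^{\prime}(0)|=f(0)$, so $\delta_{0}=0$ is not a risk critical point for this particular choice of $p$.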

To analyze the polynomial threshold risk in more detail it will be necessary to consider the real roots of the underlying polynomial. Let $r_{1}(\delta)\leq r_{2}(\delta)\leq\ldots\leq r_{n_{1}}(\delta)$ be the real roots of $p_{\delta}(X)=0$, where $n_{1}\leq n$. We can partition $\mathbb{R}$ into the union of the following intervals

\begin{split}I_{0}(\delta)&=\big(-\infty,r_{1}(\delta)\big),\\ I_{1}(\delta)&=\big[r_{1}(\delta),r_{2}(\delta)\big),\\ I_{2}(\delta)&=\big[r_{2}(\delta),r_{3}(\delta)\big),\\ &\;\;\vdots\\ I_{n_{1}}(\delta)&=\big[r_{n_{1}}(\delta),\infty\big),\end{split}

where we observe that the sign of the polynomial $p_{\delta}(X)$ cannot change inside any of the intervals $I_{j}$ . Let $\mathcal{J}^{-}(\delta)=\{j|p_{\delta}(x)\leq 0,\;\text{if}\;\;x\in I_{j}\}$ . The threshold risk can now be expressed as

R(\delta)=\sum_{j\in\mathcal{J}^{-}(\delta)}R(I_{j}(\delta))=\sum_{j\in\mathcal{J}^{-}(\delta)}\int_{I_{j}(\delta)}f(x)\,dx=\sum_{j\in\mathcal{J}^{-}(\delta)}\big[F(r_{j+1}(\delta))-F(r_{j}(\delta))\big],

(4)

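where $F(x)=\int_{-\infty}^{x}f(u)du$ and we adopt the conventions $r_{0}(\delta)=-\infty$ and $r_{n_{1}+1}(\delta)=+\infty$. Hence the derivative of $R(\delta)$ with respect to $\delta$ is given by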

R^{\prime}(\delta)=\sum_{j\in\mathcal{J^{-}}(\delta)}[f(r_{j+1}(\delta))r_{j+1}^{\prime}(\delta)-f(r_{j}(\delta))r_{j}^{\prime}(\delta)],

(5)

whenever the derivatives of these roots at $\delta$ exist. Let $\text{Dis}(p_{\delta}(X))$ denote the discriminant of the polynomial $p_{\delta}(X).$
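Formula (4) can be sketched numerically as follows. This is only an illustration under stated assumptions: $X\sim\mathcal{N}(0,1)$, the constant perturbation $p_{\delta}(X)=p(X)-\delta$, and a simple grid scan with bisection as the root finder (which can miss roots of even multiplicity and roots outside the scanned range). All function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cdf, standing in for F in (4)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def evaluate(coeffs, x):
    """Horner evaluation; coeffs = [p_0, p_1, ..., p_n]."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def real_roots(coeffs, lo=-10.0, hi=10.0, steps=4000):
    """Sign-change scan plus bisection; may miss even-multiplicity roots."""
    roots = []
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    for a, b in zip(grid, grid[1:]):
        fa, fb = evaluate(coeffs, a), evaluate(coeffs, b)
        if fa == 0.0:
            roots.append(a)
        elif fa * fb < 0.0:
            for _ in range(80):
                mid = 0.5 * (a + b)
                fm = evaluate(coeffs, mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
    return roots


def interior_point(left, right):
    """A point strictly inside (left, right), allowing infinite endpoints."""
    if math.isinf(left) and math.isinf(right):
        return 0.0
    if math.isinf(left):
        return right - 1.0
    if math.isinf(right):
        return left + 1.0
    return 0.5 * (left + right)


def threshold_risk(coeffs, delta, cdf=normal_cdf):
    """R(delta) = P(p(X) < delta): sum F-differences over intervals with p_delta < 0."""
    shifted = [coeffs[0] - delta] + list(coeffs[1:])
    edges = [-math.inf] + real_roots(shifted) + [math.inf]
    risk = 0.0
    for left, right in zip(edges, edges[1:]):
        if evaluate(shifted, interior_point(left, right)) < 0.0:
            risk += cdf(right) - cdf(left)
    return risk


print(threshold_risk([0.0, 0.0, 1.0], 1.0))        # p(X) = X^2, delta = 1
print(threshold_risk([0.0, -3.0, 0.0, 1.0], 0.0))  # p(X) = X^3 - 3X, delta = 0
```

The sign of $p_{\delta}$ is tested at one interior point per interval, which is justified by the observation above that the sign cannot change inside any $I_{j}(\delta)$.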

Remark 1: It will be seen that the discriminant is a polynomial in $\delta$. This is what we refer to as “the hidden polynomial”. Consider the equation

\text{Dis}(p_{\delta}(X))=0.

($*$)

In the following proposition, we show that the hidden equation ($*$) is that of finding the roots of a polynomial in $\delta$ of finite order.

Proposition 1.

For the case where $p_{\delta}(X)=p(X)-\delta$ , the hidden polynomial $\text{Dis}(p_{\delta}(X))$ is a polynomial in $\delta$ of order not greater than $n-1$ .

Proof.

Using Lemma 6 in the Appendix, we have

\begin{split}&\text{Dis}(p_{\delta}(X))=\frac{(-1)^{n(n-1)/2}}{p_{n}}\text{Res}(p_{\delta}(X),p^{\prime}_{\delta}(X))\\ =&\frac{(-1)^{n(n-1)/2}}{p_{n}}\text{Det}\begin{bmatrix}p_{n}&p_{n-1}&\cdots&\cdots&p_{0}-\delta&0&\cdots&0\\ 0&p_{n}&\cdots&\cdots&p_{1}&p_{0}-\delta&\cdots&0\\ \vdots&\vdots&\ddots&\ddots&\vdots&\vdots&\ddots&p_{0}-\delta\\ np_{n}&(n-1)p_{n-1}&\cdots&\cdots&p_{1}&0&\cdots&0\\ 0&np_{n}&\cdots&\cdots&\cdots&p_{1}&\cdots&0\\ \vdots&\vdots&\ddots&\ddots&\vdots&\vdots&\ddots&p_{1}\end{bmatrix}.\end{split}

(6)

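Note that $\delta$ appears in only $n-1$ rows of this matrix, namely the rows built from the coefficients of $p_{\delta}(X)$, and exactly once in each of them. By the multilinearity of the determinant, $\text{Dis}(p_{\delta}(X))$ is therefore a polynomial in $\delta$ of order not greater than $n-1$. ∎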
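As a concrete check of Proposition 1, the sketch below evaluates the discriminant through the Sylvester determinant in exact rational arithmetic. The cubic $p(X)=X^{3}-3X$ is an illustrative choice of ours, not an example from the paper; for it the hidden polynomial should equal $108-27\delta^{2}$, which has degree $n-1=2$. The helper names are hypothetical.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


def sylvester(f, g):
    """Sylvester matrix of f and g; coefficients listed highest degree first."""
    n, m = len(f) - 1, len(g) - 1
    size = n + m
    rows = [[0] * i + list(f) + [0] * (size - n - 1 - i) for i in range(m)]
    rows += [[0] * i + list(g) + [0] * (size - m - 1 - i) for i in range(n)]
    return rows


def discriminant(f):
    """Dis(f) = (-1)^(n(n-1)/2) / f_n * Res(f, f')."""
    n = len(f) - 1
    derivative = [c * (n - k) for k, c in enumerate(f[:-1])]
    sign = -1 if (n * (n - 1) // 2) % 2 else 1
    return sign * determinant(sylvester(f, derivative)) / Fraction(f[0])


def hidden_polynomial(delta):
    """Discriminant of p_delta(X) = X^3 - 3X - delta (illustrative cubic)."""
    return discriminant([1, 0, -3, -Fraction(delta)])


for d in (-2, 0, 1, 2):
    print(d, hidden_polynomial(d))
```

For this cubic the hidden equation $108-27\delta^{2}=0$ has the two solutions $\delta=\pm 2$, which are the candidate risk critical points.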

The next theorem shows that zeroes of the discriminant play an important role in our analysis. In particular, it shows that every risk critical point is among these zeroes.

Theorem 1.

Let $\mathcal{Z}(p_{\delta}(X))=\{\delta\;|\;\text{Dis}(p_{\delta}(X))=0\}$. Then

(i) if the set $C$ of risk critical points is non-empty, then it is a subset of $\mathcal{Z}(p_{\delta}(X))$;

(ii) there exist at most $n-1$ risk critical points, where $n$ is the degree of $p_{\delta}(X)$.

Proof.

Proof of (i). We apply Theorem 3 in the Appendix. If $\delta_{0}\in\mathcal{Z}(p_{\delta}(X))$, then some root $r_{j}(\delta_{0})$ is a branching point of order $n^{\prime}\leq n$, and the root $r_{j}(\delta)$ has a Puiseux series representation

r_{j}(\delta)=\sum_{k=0}^{\infty}c_{jk}(\delta-\delta_{0})^{k/n^{\prime}},

and

r_{j}^{\prime}(\delta)=\sum_{k=1}^{\infty}c_{jk}\frac{k}{n^{\prime}}(\delta-\delta_{0})^{k/n^{\prime}-1}.

If $n^{\prime}>1$, then $k/n^{\prime}-1<0$ for $k<n^{\prime}$, and therefore $(\delta-\delta_{0})^{k/n^{\prime}-1}\rightarrow\infty$ as $\delta\rightarrow\delta_{0}$. However, not all $\delta_{0}\in\mathcal{Z}(p_{\delta}(X))$ are risk critical points. For instance, $r_{j}(\delta_{0})$ could be a repeated root but not in the support of the distribution of $X$.

Similarly, if $\delta_{0}\notin\mathcal{Z}(p_{\delta}(X))$, then each root $r_{j}(\delta)$ has a power series expansion around $\delta_{0}$,

r_{j}(\delta)=\sum_{k=0}^{\infty}c_{jk}(\delta-\delta_{0})^{k}

and

r_{j}^{\prime}(\delta)=\sum_{k=1}^{\infty}kc_{jk}(\delta-\delta_{0})^{k-1}

so that $\lim_{\delta\rightarrow\delta_{0}}r^{\prime}_{j}(\delta)=c_{j1}$ is finite. Hence $\delta_{0}$ is not a risk critical point. Therefore, $C$ is a subset of $\mathcal{Z}(p_{\delta}(X))$.

Proof of (ii) follows immediately from Proposition 1. ∎

Example 1: Let $p_{\delta_{0}}(X)=X^{2}-\delta_{0}$ and $X\sim\mathcal{N}(0,1^{2})$. Here $\text{Dis}(p_{\delta_{0}}(X))=4\delta_{0}$, hence $\mathcal{Z}(p_{\delta_{0}}(X))=\{0\}.$ If $\delta_{0}=0$, then $r_{1}(\delta_{0})=0$ is a root with even multiplicity. By Theorem 1, $\delta_{0}=0$ is a candidate risk critical point. If we perturb $\delta_{0}$ to $\delta=\delta_{0}+\varepsilon$ with $\varepsilon>0$, then $p_{\delta}(X)$ has roots $r_{1}(\delta)=-\sqrt{\delta},\ r_{2}(\delta)=\sqrt{\delta}$ with derivatives $r_{1}^{\prime}(\delta)=-\frac{1}{2\sqrt{\delta}},\ r_{2}^{\prime}(\delta)=\frac{1}{2\sqrt{\delta}}$. Recalling the density function of the standard normal distribution, we see that equation (5) implies that the rate of change of the threshold risk is given by $R^{\prime}(\delta)=\frac{1}{2\sqrt{\delta}}\frac{1}{\sqrt{2\pi}}e^{-0.5\delta}-\big(-\frac{1}{2\sqrt{\delta}}\frac{1}{\sqrt{2\pi}}e^{-0.5\delta}\big)=\frac{1}{\sqrt{\delta}}\frac{1}{\sqrt{2\pi}}e^{-0.5\delta}$. Clearly, the latter diverges as $\delta\to 0$ from above. Hence $\delta_{0}=0$ is a risk critical point.
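The divergence in Example 1 can also be observed numerically. The sketch below uses the closed form $R(\delta)=P(X^{2}<\delta)=\mathrm{erf}\big(\sqrt{\delta/2}\big)$ for $\delta>0$, which follows from the standard normal cdf, and evaluates the ratio $S(0,\delta)$ from (3) for shrinking $\delta$. The function names are ours.

```python
import math


def risk(delta):
    """R(delta) = P(X^2 < delta) for X ~ N(0, 1); zero when delta <= 0."""
    if delta <= 0.0:
        return 0.0
    return math.erf(math.sqrt(delta / 2.0))


def difference_quotient(delta, delta0=0.0):
    """S(delta0, delta) from (3)."""
    return abs(risk(delta) - risk(delta0)) / abs(delta - delta0)


# The quotient grows roughly like sqrt(2 / pi) / sqrt(delta) as delta -> 0+.
deltas = [10.0 ** (-k) for k in (2, 4, 6, 8)]
quotients = [difference_quotient(d) for d in deltas]
for d, s in zip(deltas, quotients):
    print(d, s, s * math.sqrt(d))
```

The rescaled values $S(0,\delta)\sqrt{\delta}$ settle near $\sqrt{2/\pi}=2f(0)$, in agreement with the expression for $R^{\prime}(\delta)$ derived above.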

3 Repeating and non-repeating root decomposition of constant perturbation

In the previous section, we derived a theorem which identifies candidates for risk critical points. However, since the sensitivity of the risk also depends on the coefficients of the root expansion as well as the value of the density function, we need to further decompose the intervals in (4). In particular, we shall separate the contribution to the threshold risk from repeating and non-repeating roots.

Definition 4.

Let $\mathcal{J}^{-}(\delta)=J^{-}_{r}(\delta)\cup J^{-}_{n}(\delta)$, where $J^{-}_{r}(\delta)$ and $J^{-}_{n}(\delta)$ are defined by

1. $j\in J^{-}_{r}(\delta)$ if and only if $j\in\mathcal{J}^{-}(\delta)$ and at least one of the endpoints $r_{j}(\delta)$ and $r_{j+1}(\delta)$ of the interval $I_{j}(\delta)$ is a repeated root of $p_{\delta}(X).$

2. $j\in J^{-}_{n}(\delta)$ if and only if $j\in\mathcal{J}^{-}(\delta)$ and both endpoints $r_{j}(\delta)$ and $r_{j+1}(\delta)$ of the interval $I_{j}(\delta)$ are non-repeating roots of $p_{\delta}(X).$

Using the above definition, (4) can be partitioned as follows

R(\delta)=R_{r}(\delta)+R_{n}(\delta)=\sum_{j\in J^{-}_{r}(\delta)}R(I_{j}(\delta))+\sum_{j\in J^{-}_{n}(\delta)}R(I_{j}(\delta)).

(7)

By Theorem 1, critical points must be among the zeroes of the discriminant of $p_{\delta}(X)$. However, the discriminant is zero only if the resultant is zero since the leading coefficient $p_{n}\neq 0$. Now the resultant is zero only if $p_{\delta}(X)$ has repeated roots. If $p_{\delta}(X)$ has no repeated roots, then the hidden equation ($*$) cannot be satisfied. Therefore $J_{r}^{-}(\delta)$ is empty and $\delta$ is not a risk critical point.

Lemma 1.

For the non-repeated root component $R_{n}(\delta)$ , we have

\lim_{\delta\rightarrow\delta_{0}}\frac{R_{n}(\delta)-R_{n}(\delta_{0})}{|\delta-\delta_{0}|}=c,

for some finite scalar $c$ .

Proof.

We have from (7)

R_{n}(\delta_{0})=\sum_{j\in J^{-}_{n}(\delta_{0})}R(I_{j}(\delta_{0}))=\sum_{j\in J^{-}_{n}(\delta_{0})}\big[F(r_{j+1}(\delta_{0}))-F(r_{j}(\delta_{0}))\big].

(8)

Since non-repeating roots have multiplicity 1, they have a root expansion in a small neighbourhood of $\delta_{0}$ of the form $r_{j}(\delta)=\sum_{k=0}^{\infty}c_{jk}(\delta-\delta_{0})^{k}.$ Since $c_{j0}=r_{j}(\delta_{0})$, it follows that $r_{j}(\delta)-r_{j}(\delta_{0})=\sum_{k=1}^{\infty}c_{jk}(\delta-\delta_{0})^{k}$. If we consider a single interval $I_{j}(\delta)$ with $j\in J_{n}^{-}(\delta)$, then

\begin{split}R(I_{j}(\delta))-R(I_{j}(\delta_{0}))&=\big[F(r_{j+1}(\delta))-F(r_{j+1}(\delta_{0}))\big]-\big[F(r_{j}(\delta))-F(r_{j}(\delta_{0}))\big]\\ &\approx f(r_{j+1}(\zeta))(r_{j+1}(\delta)-r_{j+1}(\delta_{0}))-f(r_{j}(\zeta))(r_{j}(\delta)-r_{j}(\delta_{0}))\\ &=f(r_{j+1}(\zeta))\sum_{k=1}^{\infty}c_{(j+1)k}(\delta-\delta_{0})^{k}-f(r_{j}(\zeta))\sum_{k=1}^{\infty}c_{jk}(\delta-\delta_{0})^{k},\end{split}

for some $\zeta$ between $\delta$ and $\delta_{0}$ . Hence, we have

\lim_{\delta\rightarrow\delta_{0}}\frac{R(I_{j}(\delta))-R(I_{j}(\delta_{0}))}{|\delta-\delta_{0}|}=f(r_{j+1}(\zeta))c_{(j+1)1}-f(r_{j}(\zeta))c_{j1},

which is bounded. Since this holds for every interval in $J_{n}^{-}(\delta)$ , the result follows from (8). ∎

The above lemma shows that in the case of a threshold $\delta_{0}$ such that the roots of the polynomial $p_{\delta_{0}}(X)$ are all non-repeating, $\delta_{0}$ cannot be a risk critical point. However, the case where the polynomial $p_{\delta_{0}}(X)$ has repeated roots is more complicated. Below, we first distinguish between the cases where the multiplicity of the repeated root is odd and where it is even. Then we analyse the case where the interval $I_{j}(\delta)$ with $j\in J_{r}^{-}(\delta)$ has one repeated and one non-repeated root as its endpoints.

Lemma 2.

Consider the perturbation of the form $p_{\delta}(X)=p(X)-\delta$ , for $\delta-\delta_{0}>0$ and sufficiently close to $\delta_{0}$ , where $r_{j}(\delta_{0})$ is a repeated root of $p_{\delta_{0}}(X)=0$ . Then the order of the branching point of a repeated root is exactly 2 if the multiplicity of the root is even, and exactly 1 if the multiplicity of the root is odd.

Proof.

The graph of the polynomial $p_{\delta}(X)$ is simply the graph of $p_{\delta_{0}}(X)$ shifted down by $\delta-\delta_{0}$. If the multiplicity of the root $r_{j}(\delta_{0})$ is odd, the polynomial $p_{\delta}(X)$ crosses the x-axis at the corresponding root $r_{j}(\delta)$. Hence there is only one branch of that root. If, on the other hand, the multiplicity of the root is even, the polynomial $p_{\delta_{0}}(X)$ touches the x-axis at $r_{j}(\delta_{0})$, but $p_{\delta}(X)$ will have two distinct roots: one to the left of $r_{j}(\delta_{0})$ and one to its right. Hence there are two branches of the root. ∎

Proposition 2.

Consider the perturbation of the form $p_{\delta}(X)=p(X)-\delta$ , for $\delta>\delta_{0}$ and sufficiently close to $\delta_{0}$ . Suppose $r_{j}(\delta_{0})$ is the only repeated root of $p_{\delta_{0}}(X)$ and $f(r_{j}(\delta_{0}))>0$ . We have the following cases,

(i) If $r_{j}(\delta_{0})$ has odd multiplicity, then $\delta_{0}$ is not a risk critical point.

(ii) If $r_{j}(\delta_{0})$ has even multiplicity and $p_{\delta_{0}}(X)>0$ in a deleted neighbourhood of $r_{j}(\delta_{0})$, then $\delta_{0}$ is a risk critical point.

(iii) If $r_{j}(\delta_{0})$ has even multiplicity and $p_{\delta_{0}}(X)<0$ in a deleted neighbourhood of $r_{j}(\delta_{0})$, then $\delta_{0}$ is not a risk critical point.

Proof.

(i) If $r_{j}(\delta_{0})$ has odd multiplicity, then by Lemma 2 the order of the branching point of $r_{j}(\delta_{0})$ is $n^{\prime}=1$. Therefore, by Theorem 3(b) of the Appendix, it has a root expansion in a small neighbourhood of $\delta_{0}$ of the form $r_{j}(\delta)=\sum_{k=0}^{\infty}c_{jk}(\delta-\delta_{0})^{k}.$ Since $c_{j0}=r_{j}(\delta_{0})$, it follows that $r_{j}(\delta)-r_{j}(\delta_{0})=\sum_{k=1}^{\infty}c_{jk}(\delta-\delta_{0})^{k}$. Without loss of generality, we may assume that $r_{j+1}(\delta)$ is a non-repeated root of $p_{\delta}(X)=0$. It now follows that

\lim_{\delta\rightarrow\delta_{0}}\frac{R(I_{j}(\delta))-R(I_{j}(\delta_{0}))}{|\delta-\delta_{0}|}=f(r_{j+1}(\zeta))c_{(j+1)1}-f(r_{j}(\zeta))c_{j1},

for some $\zeta$ between $\delta$ and $\delta_{0}$ . The right hand side of the above is bounded. Similarly for contributions to $R(\delta)$ for all other intervals $I_{k}(\delta)$ where $k\neq j$ . Hence $\delta_{0}$ is not a risk critical point.

(ii) Similarly, by Lemma 2, if $r_{j}(\delta_{0})$ has even multiplicity and $p_{\delta_{0}}(X)>0$ in a deleted neighbourhood of $r_{j}(\delta_{0})$ , the order of branching point is $n^{\prime}=2$ . Therefore, by Theorem 3(b) of the Appendix, there are two root expansions in a small neighbourhood of $\delta_{0}$ namely $r_{j}(\delta)=\sum_{k=0}^{\infty}c_{jk}(\delta-\delta_{0})^{k/2}$ and $r_{j+1}(\delta)=\sum_{k=0}^{\infty}c_{(j+1)k}(\delta-\delta_{0})^{k/2}$ . It follows that $c_{j0}=c_{(j+1)0}=r_{j}(\delta_{0})$ . Next

\begin{split}R(I_{j}(\delta))-R(I_{j}(\delta_{0}))&=\big[F(r_{j+1}(\delta))-F(r_{j+1}(\delta_{0}))\big]-\big[F(r_{j}(\delta))-F(r_{j}(\delta_{0}))\big]\\ &\approx f(r_{j+1}(\zeta))(r_{j+1}(\delta)-r_{j+1}(\delta_{0}))-f(r_{j}(\zeta))(r_{j}(\delta)-r_{j}(\delta_{0}))\\ &=f(r_{j+1}(\zeta))\sum_{k=1}^{\infty}c_{(j+1)k}(\delta-\delta_{0})^{k/2}-f(r_{j}(\zeta))\sum_{k=1}^{\infty}c_{jk}(\delta-\delta_{0})^{k/2},\end{split}

for some $\zeta$ between $\delta$ and $\delta_{0}$ . If we consider $\varepsilon=\delta-\delta_{0}>0$ , we have

\begin{split}&\lim_{\delta\rightarrow\delta_{0}}\frac{R(I_{j}(\delta))-R(I_{j}(\delta_{0}))}{|\delta-\delta_{0}|}\\ &\approx\lim_{\varepsilon\rightarrow 0}\{f(r_{j+1}(\zeta))\big{[}c_{(j+1)1}\varepsilon^{-1/2}+c_{(j+1)2}+c_{(j+1)(3)}\varepsilon^{3/2-1}+\ldots\big{]}\\ &-f(r_{j}(\zeta))\big{[}c_{j1}\varepsilon^{-1/2}+c_{j2}+c_{j3}\varepsilon^{3/2-1}+\ldots\big{]}\},\\ &\approx\lim_{\varepsilon\rightarrow 0}\{f(r_{j+1}(\zeta))\big{[}c_{(j+1)1}\varepsilon^{-1/2}+c_{(j+1)2}\big{]}-f(r_{j}(\zeta))\big{[}c_{j1}\varepsilon^{-1/2}+c_{j2}\big{]}\}.\end{split}

(9)

Since $r_{j}(\delta_{0})$ has even multiplicity, for $\delta>\delta_{0}$ and sufficiently close to $\delta_{0}$ , one of the roots, say $r_{j+1}(\delta)$ is bigger than $r_{j}(\delta_{0})$ , and the other one is smaller. Hence if $r_{j+1}(\delta)>r_{j}(\delta)$ , then $c_{(j+1)1}>0$ and $c_{j1}<0$ . Therefore $f(r_{j+1}(\zeta))c_{(j+1)1}-f(r_{j}(\zeta))c_{j1}>0$ , and the above limit diverges as $\varepsilon\rightarrow 0$ . Note that contributions to $R(\delta)$ for all other intervals $I_{k}(\delta)$ where $k\neq j$ are constant as in (i). Hence $\delta_{0}$ is a risk critical point.

(iii) If $r_{j}(\delta_{0})$ has even multiplicity and $p_{\delta_{0}}(X)<0$ in a deleted neighbourhood of $r_{j}(\delta_{0})$ , then $r_{j}(\delta_{0})$ is a local maximum of $p_{\delta_{0}}(X)$ . Hence, the interval $I_{j}(\delta_{0})$ is of measure $0$ and contributes nothing to $R(I_{j}(\delta_{0}))$ . When the graph of the polynomial is shifted down by $\varepsilon=\delta-\delta_{0},$ there is no longer a root in a sufficiently small neighbourhood of $r_{j}(\delta_{0})$ . Hence, again, there is no contribution to $R(\delta)$ from $R(I_{j}(\delta))$ . Thus $\delta_{0}$ is not a risk critical point. ∎
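Case (ii) can be made concrete with a small worked example. The cubic $p(X)=X^{3}-3X$ and the threshold $\delta_{0}=-2$ are illustrative choices of ours, not taken from the paper; $f$ denotes any density with $f(1)>0$, and $\varepsilon=\delta-\delta_{0}>0$ as in the proof.

```latex
% Illustrative assumption: p(X) = X^3 - 3X, delta_0 = -2, epsilon = delta + 2 > 0.
p_{\delta_{0}}(X) = X^{3}-3X+2 = (X-1)^{2}(X+2),
\qquad
p_{\delta_{0}}(X) > 0 \ \text{in a deleted neighbourhood of } X=1 .

% Expanding around the double root X = 1:
p_{\delta}(X) = 3(X-1)^{2} + (X-1)^{3} - \varepsilon
\;\Longrightarrow\;
r_{j}(\delta) = 1-\sqrt{\varepsilon/3}+O(\varepsilon),
\quad
r_{j+1}(\delta) = 1+\sqrt{\varepsilon/3}+O(\varepsilon).

% Hence c_{j1} = -1/\sqrt{3}, c_{(j+1)1} = 1/\sqrt{3}, and
\frac{R(I_{j}(\delta))-R(I_{j}(\delta_{0}))}{\varepsilon}
\approx \frac{2 f(1)}{\sqrt{3\varepsilon}}
\longrightarrow \infty
\quad \text{as } \varepsilon \to 0^{+}.
```

Here the double root at $X=1$ is the only repeated root of $p_{\delta_{0}}(X)$, so the hypotheses of Proposition 2 (ii) hold and the leading coefficients have the opposite signs used in the argument after (9).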

Corollary 1.

Assume conditions of Proposition 2 (ii) apply to two or more distinct repeated roots $r_{1}(\delta_{0}),\ldots,r_{l}(\delta_{0})$ , with even multiplicity. If the density function $f(r_{j}(\delta_{0}))>0$ for at least one of these roots, then $\delta_{0}$ is a risk critical point.

Proof.

This is a generalization of Proposition 2. Since these even multiplicity roots are local minima, they contribute to the threshold risk sensitivity, in the limit as $\delta\to\delta_{0}$, as in the proof of part (ii) of Proposition 2. Furthermore, there is no such contribution from any other roots. ∎

Remark 2: Note that the assumption that $X$ was an absolutely continuous random variable could be easily relaxed, but at the cost of more complicated notation and some additional technicalities. For instance, if $\tilde{x}$ is a discontinuity of the cdf $F(x)$ and $\delta_{0}$ is a threshold such that the $j^{th}$ root $r_{j}(\delta_{0})=\tilde{x},$ then $\delta_{0}$ is a candidate for a risk critical point.

4 Rational Function

Next we analyze the situation when the underlying function of the random variable $X$ is a rational function, namely, a ratio of two polynomials. Indeed, this was the case in the motivating study [4].

Definition 5 (Risk with rational function).

Let $X$ be a random variable and $\delta\in\mathbb{R}$. Let $p(X)=p_{0}+p_{1}X+\ldots+p_{n}X^{n}$ and $q(X)=q_{0}+q_{1}X+\ldots+q_{m}X^{m}$ be two co-prime polynomials. Let $h(X)=\frac{p(X)}{q(X)}$ and consider the threshold risk

\begin{split}R(\delta)&=P\bigg{(}h(X)<\delta\bigg{)}\\ &=P\big{(}p(X)-\delta q(X)<0|q(X)>0\big{)}P\big{(}q(X)>0\big{)}\\ &+P\big{(}p(X)-\delta q(X)>0|q(X)<0\big{)}P\big{(}q(X)<0\big{)}.\end{split}

(10)

The risk sensitivity is defined as before in Definition 3. In this rational function case the roots of the denominator $q(X)$ will impact (10). To compute the threshold risk, let $\tilde{r}_{1}\leq\tilde{r}_{2}\leq\ldots\leq\tilde{r}_{m^{\prime}}$ be the real roots of the polynomial $q(X)$, where $m^{\prime}\leq m$. We can factor $q(X)$ as

q(X)=\bigg{[}\prod_{d=1}^{m^{\prime}}\big{(}X-\tilde{r}_{d}\big{)}\bigg{]}\tilde{q}(X).

Now we can partition $\mathbb{R}$ by these roots as

\mathbb{R}=\bigcup_{j=1}^{m^{\prime}+1}\tilde{I}_{j}=(-\infty,\tilde{r}_{1})\cup[\tilde{r}_{1},\tilde{r}_{2})\cup\cdots\cup[\tilde{r}_{m^{\prime}},\infty).

(11)

Some of the intervals can have zero length if there are repeated roots for $q(X)$ . Next, we define the event of interest as

E=\bigg{\{}x\;\bigg{|}\frac{p(x)}{q(x)}\leq\delta\bigg{\}}=\bigcup_{j=1}^{m^{\prime}+1}E_{j},

(12)

where $E_{j}=E\cap\tilde{I}_{j}$. Then we partition the index set $J=J^{+}\cup J^{-}=\{1,2,\ldots,m^{\prime}+1\}$ as

\begin{split}J^{+}&=\big\{j\;\big|\;q(x)>0\;\;\;\forall x\in\tilde{I}_{j}\big\},\\ J^{-}&=\big\{j\;\big|\;q(x)<0\;\;\;\forall x\in\tilde{I}_{j}\big\}.\end{split}

The threshold risk probability can be computed as

R(\delta)=P(E)=\sum_{j\in J^{+}}P(E_{j})+\sum_{j\in J^{-}}P(E_{j}).

(13)

For the polynomial function $h_{\delta}(X)=p(X)-\delta q(X)$ , we can express the events $E_{j}$ in (13) more explicitly conditioned on the sign of $h_{\delta}(X)$

\begin{split}&\text{if}\;\;j\in J^{+},\;E_{j}=\big{\{}x\in\tilde{I}_{j}\big{\}}\cap\big{\{}x|h_{\delta}(x)\leq 0\big{\}},\\ &\text{if}\;\;j\in J^{-},\;E_{j}=\big{\{}x\in\tilde{I}_{j}\big{\}}\cap\big{\{}x|h_{\delta}(x)>0\big{\}}.\end{split}

(14)

Let $r_{1}(\delta)\leq r_{2}(\delta)\leq\cdots\leq r_{n^{\prime}}(\delta)$, where $n^{\prime}\leq\max(n,m)$, be the real roots of the polynomial $h_{\delta}(X)$. We can factor this polynomial as

h_{\delta}(X)=\bigg{[}\prod_{k=1}^{n^{\prime}}\big{(}X-r_{k}(\delta)\big{)}\bigg{]}\tilde{h}(X),

and similarly, we can partition $\mathbb{R}$ by these roots as

\mathbb{R}=\bigcup_{k=1}^{n^{\prime}+1}I_{k}(\delta)=(-\infty,r_{1}(\delta))\cup[r_{1}(\delta),r_{2}(\delta))\cup\cdots\cup[r_{n^{\prime}}(\delta),\infty).

(15)

Using (14) and (15), we can sub-divide each $E_{j}$ into parts intersecting with the intervals $I_{k}(\delta)$ , by defining

E_{jk}(\delta)=E_{j}\cap I_{k}(\delta)=E\cap\tilde{I}_{j}\cap I_{k}(\delta).

(16)

There are two cases to consider

\tilde{I}_{j}\cap I_{k}(\delta)=I_{jk}(\delta)=\begin{cases}\big[\tilde{r}_{j-1}\vee r_{k-1}(\delta),\,\tilde{r}_{j}\wedge r_{k}(\delta)\big),&\text{if}\;\;\tilde{I}_{j}\cap I_{k}(\delta)\neq\emptyset,\\ \emptyset,&\text{if}\;\;\tilde{I}_{j}\cap I_{k}(\delta)=\emptyset,\end{cases}

(17)

where $\tilde{r}_{j-1}\vee r_{k-1}(\delta)=\text{max}\big(\tilde{r}_{j-1},r_{k-1}(\delta)\big)$ and $\tilde{r}_{j}\wedge r_{k}(\delta)=\text{min}\big(\tilde{r}_{j},r_{k}(\delta)\big)$, with the conventions $\tilde{r}_{0}=r_{0}(\delta)=-\infty$ and $\tilde{r}_{m^{\prime}+1}=r_{n^{\prime}+1}(\delta)=+\infty$. It is important to note that the sign of $h_{\delta}(X)$ remains constant on each interval $I_{k}(\delta)$ and hence also on $I_{jk}(\delta)$. If $j\in J^{+}$, then

E_{jk}(\delta)=\begin{cases}I_{jk}(\delta),&\text{if}\;\;h_{\delta}(X)\leq 0\;\text{on}\;I_{k}(\delta),\\ \emptyset,&\text{otherwise}.\end{cases}

(18)

If $j\in J^{-}$, then

E_{jk}(\delta)=\begin{cases}I_{jk}(\delta),&\text{if}\;\;h_{\delta}(X)>0\;\text{on}\;I_{k}(\delta),\\ \emptyset,&\text{otherwise}.\end{cases}

(19)

Hence, we can refine (13) and compute the threshold risk probability for the rational function $h(X)$ as

R(\delta)=\sum_{j\in J^{+}}\sum_{k=1}^{n^{\prime}+1}P(E_{jk}(\delta))+\sum_{j\in J^{-}}\sum_{k=1}^{n^{\prime}+1}P(E_{jk}(\delta)).

(20)

Let

\begin{split}K^{-}&=\big\{k\;\big|\;h_{\delta}(X)\leq 0\;\;\text{on}\;\;I_{k}(\delta)\big\},\\ K^{+}&=\big\{k\;\big|\;h_{\delta}(X)>0\;\;\text{on}\;\;I_{k}(\delta)\big\}.\end{split}

Substituting (18) and (19) into (20) and rearranging, we obtain the risk probability

\begin{split}R(\delta)&=\sum_{j\in J^{+}}\sum_{k\in K^{-}}P(E_{jk}(\delta))+\sum_{j\in J^{-}}\sum_{k\in K^{+}}P(E_{jk}(\delta))\\ &=\sum_{j\in J^{+}}\sum_{k\in K^{-}}\bigg[F\big(\tilde{r}_{j}\wedge r_{k}(\delta)\big)-F\big(\tilde{r}_{j-1}\vee r_{k-1}(\delta)\big)\bigg]\\ &+\sum_{j\in J^{-}}\sum_{k\in K^{+}}\bigg[F\big(\tilde{r}_{j}\wedge r_{k}(\delta)\big)-F\big(\tilde{r}_{j-1}\vee r_{k-1}(\delta)\big)\bigg],\end{split}

(21)

and its derivative, whenever it exists, is

\begin{split}R^{\prime}(\delta)&=\sum_{j\in J^{+}}\sum_{k\in K^{-}}\bigg{[}f\big{(}\tilde{r}_{j}\wedge r_{k}(\delta)\big{)}\frac{d}{d\delta}\big{(}\tilde{r}_{j}\wedge r_{k}(\delta)\big{)}-f\big{(}\tilde{r}_{j-1}\vee r_{k-1}(\delta)\big{)}\frac{d}{d\delta}\big{(}\tilde{r}_{j-1}\vee r_{k-1}(\delta)\big{)}\ \bigg{]}\\ &+\sum_{j\in J^{-}}\sum_{k\in K^{+}}\bigg{[}f\big{(}\tilde{r}_{j}\wedge r_{k}(\delta)\big{)}\frac{d}{d\delta}\big{(}\tilde{r}_{j}\wedge r_{k}(\delta)\big{)}-f\big{(}\tilde{r}_{j-1}\vee r_{k-1}(\delta)\big{)}\frac{d}{d\delta}\big{(}\tilde{r}_{j-1}\vee r_{k-1}(\delta)\big{)}\bigg{]}.\end{split}

(22)

Equations (21)-(22) are analogous to (4)-(5) in the one polynomial case. Naturally, they reflect an additional degree of complexity arising from the possibility of overlaps between intervals $\tilde{I}_{j}$ and $I_{k}(\delta)$ . While this complexity is unavoidable, one case where there are no difficulties is described in the next lemma.

Lemma 3.

Let $\mathcal{Z}(h_{\delta}(X))=\{\delta\;|\;\text{Dis}(h_{\delta}(X))=0\}$ and consider the intervals $\tilde{I}_{j}$ in (11) and $I_{k}(\delta)$ in (15). Then $\delta$ is not a candidate for a risk critical point if each interval $\tilde{I}_{j}$ for $j\in J^{+}$ is contained in some interval $I_{k}(\delta)$ for $k\in K^{-}$ or, similarly, if each interval $\tilde{I}_{j}$ for $j\in J^{-}$ is contained in some interval $I_{k}(\delta)$ for $k\in K^{+}$.

Proof.

Under the interval inclusion hypotheses it can be easily verified that for every $j,$ $\tilde{r}_{j}\wedge r_{k}(\delta)=\tilde{r}_{j}$ and $\tilde{r}_{j-1}\vee r_{k-1}(\delta)=\tilde{r}_{j-1}$ . Hence equation (22) is well-defined and reduces to

\begin{split}R^{\prime}(\delta)&=\sum_{j\in J^{+}}\sum_{k\in K^{-}}\bigg{[}f\big{(}\tilde{r}_{j}\big{)}\frac{d}{d\delta}\big{(}\tilde{r}_{j}\big{)}-f\big{(}\tilde{r}_{j-1}\big{)}\frac{d}{d\delta}\big{(}\tilde{r}_{j-1}\big{)}\ \bigg{]}\\ &+\sum_{j\in J^{-}}\sum_{k\in K^{+}}\bigg{[}f\big{(}\tilde{r}_{j}\big{)}\frac{d}{d\delta}\big{(}\tilde{r}_{j}\big{)}-f\big{(}\tilde{r}_{j-1}\big{)}\frac{d}{d\delta}\big{(}\tilde{r}_{j-1}\big{)}\bigg{]}\\ &=0,\\ \end{split}

since none of the terms depend on $\delta$ . Thus the derivative of the risk is $0$ and hence $\delta$ cannot be a risk critical point. ∎
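The bookkeeping in (14)-(21) can be sketched numerically as follows. The sketch assumes $X\sim\mathcal{N}(0,1)$ and takes the real roots of $q(X)$ and $h_{\delta}(X)$ as inputs supplied by the caller, so no root finder is involved. The test case $h(X)=1/X$ (that is, $p(X)=1$, $q(X)=X$) is an illustrative choice of ours, and all function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cdf, standing in for F in (21)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def evaluate(coeffs, x):
    """Horner evaluation; coeffs are listed from the constant term upwards."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def interior_point(left, right):
    """A point strictly inside (left, right), allowing infinite endpoints."""
    if math.isinf(left) and math.isinf(right):
        return 0.0
    if math.isinf(left):
        return right - 1.0
    if math.isinf(right):
        return left + 1.0
    return 0.5 * (left + right)


def rational_risk(p, q, delta, q_roots, h_roots, cdf=normal_cdf):
    """R(delta) = P(p(X)/q(X) < delta), summed over the intervals I_jk(delta)."""
    degree = max(len(p), len(q))
    p_pad = list(p) + [0.0] * (degree - len(p))
    q_pad = list(q) + [0.0] * (degree - len(q))
    h = [a - delta * b for a, b in zip(p_pad, q_pad)]  # h_delta = p - delta*q
    edges = [-math.inf] + sorted(set(q_roots) | set(h_roots)) + [math.inf]
    risk = 0.0
    for left, right in zip(edges, edges[1:]):
        x = interior_point(left, right)
        q_val, h_val = evaluate(q, x), evaluate(h, x)
        # j in J+ with k in K-, or j in J- with k in K+, as in (18)-(19).
        if (q_val > 0.0 and h_val <= 0.0) or (q_val < 0.0 and h_val > 0.0):
            risk += cdf(right) - cdf(left)
    return risk


# Illustrative case h(X) = 1/X: q has the root 0 and h_delta = 1 - delta*X
# has the root 1/delta.
delta = 1.0
value = rational_risk([1.0], [0.0, 1.0], delta, q_roots=[0.0], h_roots=[1.0 / delta])
print(value)
```

For $\delta>0$ the event $1/X<\delta$ is $\{X<0\}\cup\{X>1/\delta\}$, so the result can be compared with $1.5-\Phi(1/\delta)$, where $\Phi$ is the standard normal cdf.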

5 Hidden equation of polynomially perturbed case

In this section, we discuss the hidden equations of the perturbed polynomial of the form $h_{\delta}(X)=p(X)-\delta q(X)$. As before, assume $p(X)=p_{0}+p_{1}X+\ldots+p_{n}X^{n}$ and $q(X)=q_{0}+q_{1}X+\ldots+q_{m}X^{m}$ are polynomials. This is a more complicated case compared to the constant perturbation discussed in Section 3.

Lemma 4.

Let $h_{\delta}(X)=p(X)-\delta q(X)$ where $\text{deg}(p(X))=n$ and $\text{deg}(q(X))=m$. The degree in $\delta$ of the hidden polynomial $\text{Dis}(h_{\delta}(X))(\delta)$ is bounded as follows:

1. $\text{deg}(\text{Dis}(h_{\delta}(X))(\delta))\leq 2m-2$ if $m>n$ and $q_{m}\delta\neq 0$,

2. $\text{deg}(\text{Dis}(h_{\delta}(X))(\delta))\leq 2n-2$ if $m<n$ and $p_{n}\neq 0$,

3. $\text{deg}(\text{Dis}(h_{\delta}(X))(\delta))\leq 2n-2$ if $m=n$ and $p_{n}-q_{n}\delta\neq 0.$

Proof.

Suppose $m>n$ . Using Lemma 6 in the Appendix, we can compute the discriminant in $\delta$ as follows,

\begin{split}&\text{Dis}(h_{\delta}(X))=\frac{(-1)^{m(m-1)/2}}{(-q_{m}\delta)}\text{Res}(h_{\delta}(X),h^{\prime}_{\delta}(X))\\ &=(-1)^{m(m-1)/2}(-q_{m}\delta)^{-1}\times\\ &\text{Det}\begin{bmatrix}-q_{m}\delta&-q_{m-1}\delta&\cdots&\cdots&p_{0}-q_{0}\delta&0&\cdots&0\\ 0&-q_{m}\delta&\cdots&\cdots&\cdots&p_{0}-q_{0}\delta&\cdots&0\\ \vdots&\vdots&\ddots&\ddots&\vdots&\vdots&\ddots&p_{0}-q_{0}\delta\\ -mq_{m}\delta&-(m-1)q_{m-1}\delta&\cdots&\cdots&p_{1}-q_{1}\delta&0&\cdots&0\\ 0&-mq_{m}\delta&-(m-1)q_{m-1}\delta&\cdots&\cdots&p_{1}-q_{1}\delta&\cdots&0\\ \vdots&\vdots&\ddots&\ddots&\vdots&\vdots&\ddots&p_{1}-q_{1}\delta\\ \end{bmatrix}.\end{split}

(23)

Note that every entry of this $(2m-1)\times(2m-1)$ Sylvester matrix is a polynomial of degree at most one in $\delta$ . By the multilinearity of the determinant, the determinant is therefore a polynomial in $\delta$ of degree at most $2m-1$ . Moreover, every entry of the first column is a multiple of $q_{m}\delta$ , so the determinant is divisible by $q_{m}\delta$ . Multiplying it by $(-1)^{m(m-1)/2}(-q_{m}\delta)^{-1}$ , which is well defined because $q_{m}\delta\neq 0$ , we obtain a polynomial in $\delta$ of degree at most $2m-2$ . The proofs of the other cases are very similar. ∎
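As a concrete check of case 1, consider the illustrative choice $p(X)=1$ and $q(X)=X^{2}+X$ (our own example, not taken from the preceding sections), so that $m=2>n=0$ and $h_{\delta}(X)=-\delta X^{2}-\delta X+1$ . Applying Lemma 6 to this polynomial of degree 2, with $\delta\neq 0$ , gives

\begin{split}\text{Res}(h_{\delta}(X),h^{\prime}_{\delta}(X))&=\text{Det}\begin{bmatrix}-\delta&-\delta&1\\ -2\delta&-\delta&0\\ 0&-2\delta&-\delta\end{bmatrix}=\delta^{3}+4\delta^{2},\\ \text{Dis}(h_{\delta}(X))&=(-1)^{1}(-\delta)^{-1}(\delta^{3}+4\delta^{2})=\delta^{2}+4\delta.\end{split}

This agrees with the familiar quadratic discriminant $b^{2}-4ac=\delta^{2}+4\delta$ and has degree $2=2m-2$ , so the bound in case 1 can be attained. The polynomial $\delta(\delta+4)$ vanishes at $\delta=0$ , where the leading coefficient of $h_{\delta}(X)$ is zero, and at $\delta=-4$ , where $h_{-4}(X)=(2X+1)^{2}$ has a double root at $X=-1/2$ .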

Let $\mathcal{Z}(h_{\delta}(X))=\mathcal{Z}_{m}(h_{\delta}(X))\cup\mathcal{Z}^{\prime}(h_{\delta}(X))$ , where $\mathcal{Z}_{m}(h_{\delta}(X))$ is the set of zeroes of the leading coefficient of $h_{\delta}(X)$ , and $\mathcal{Z}^{\prime}(h_{\delta}(X))=\mathcal{Z}(h_{\delta}(X))\setminus\mathcal{Z}_{m}(h_{\delta}(X))$ is the set of those $\delta$ which are zeroes of the discriminant but not of the leading coefficient. Using (31) in Definition 8 of the Appendix, we note that $\delta$ can be a zero of the discriminant of $h_{\delta}(X)$ either by being a zero of the resultant or by being a zero of the leading coefficient. The latter occurs when $\delta\in\mathcal{Z}_{m}(h_{\delta}(X))$ .

With the help of Lemma 4, we can find the candidates for the risk critical points by solving the hidden equations in $\delta$ . However, as we have seen in the previous sections, not all roots of the hidden equations are guaranteed to be risk critical points. For instance, if $\delta\in\mathcal{Z}_{m}(h_{\delta}(X))$ , we have the following corollary.

Corollary 2.

If $\delta_{0}\in\mathcal{Z}_{m}(h_{\delta}(X))$ , then $\delta_{0}$ is a candidate for a risk critical point irrespective of the branching order of the root $r_{j}(\delta_{0})$ .

Proof.

Since the leading coefficient of $h_{\delta}(X)$ is linear in $\delta$ , any $\delta_{0}\in\mathcal{Z}_{m}(h_{\delta}(X))$ is a zero of that leading coefficient with multiplicity 1. By Theorem 3(c) in the Appendix, the root $r_{j}(\delta)$ then has a Laurent-Puiseux series representation

r_{j}(\delta)=\sum_{k=-1}^{\infty}c_{k}(\delta-\delta_{0})^{k/m^{\prime}},

(24)

where $m^{\prime}>0$ is the order of the branching point. Differentiating the first term of $r_{j}(\delta)$ gives $\frac{-c_{-1}}{m^{\prime}}(\delta-\delta_{0})^{\frac{-1}{m^{\prime}}-1}$ , which diverges as $\delta\to\delta_{0}$ for any $m^{\prime}>0$ whenever $c_{-1}\neq 0$ . ∎
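For a minimal illustration, take the hypothetical polynomial $h_{\delta}(X)=1-\delta X^{2}$ (chosen here for simplicity, not taken from the text), whose leading coefficient $-\delta$ vanishes at $\delta_{0}=0$ . For $\delta>0$ its roots and their derivatives are

r_{\pm}(\delta)=\pm\delta^{-1/2},\qquad\frac{d}{d\delta}r_{\pm}(\delta)=\mp\frac{1}{2}\delta^{-3/2}.

This is (24) with $m^{\prime}=2$ , $c_{-1}=\pm 1$ and all other coefficients equal to zero. The derivative agrees with $\frac{-c_{-1}}{m^{\prime}}(\delta-\delta_{0})^{\frac{-1}{m^{\prime}}-1}$ and is unbounded as $\delta\to 0^{+}$ .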

Remark 3: In this section the perturbed polynomial was of the form $h_{\delta}(X)=p_{0}(\delta)+p_{1}(\delta)X+\ldots+p_{n}(\delta)X^{n}.$ In this case there are two hidden equations associated with the characterization of risk critical points. The first, as before, is ($*$) of Section 2, and the second requires the leading coefficient to be 0, namely

p_{n}(\delta)=0.

($**$)

6 Illustration via Simulations

In this section, we present examples demonstrating some of the key results derived earlier. In particular, we show the importance of the connection between the distribution of the underlying random variable and the location of the roots of the perturbed polynomials. We present three numerical examples of risk critical points. The probabilities of the interval events $I_{j}(\delta)$ were estimated by the Monte Carlo method, and the roots of the perturbed polynomials were derived manually.

Example 2: Let $p_{\delta}(X)=X^{2}(X-2)-\delta$ . Here the hidden equation is $\text{Dis}(p_{\delta}(X))=-\delta(27\delta+32)=0$ , and we have two candidates for the risk critical points, $\delta=0$ and $\delta=-\frac{32}{27}.$ When $\delta=0$ , $X=0$ is a root with even multiplicity and when $\delta=-\frac{32}{27}$ , $X=4/3$ is a root with even multiplicity. Hence we simulated the threshold risk using random variables $X\sim\mathbb{N}(0,1^{2})$ and $X\sim\mathbb{N}(4/3,1^{2})$ respectively.

[Figure 1, panel (a): $X\sim\mathbb{N}(0,1^{2})$]

Figure 1 demonstrates high sensitivity of $R(\delta)$ in the neighbourhoods of $\delta=0$ and $\delta=-\frac{32}{27}.$
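The one-sided behaviour near $\delta=0$ can be reproduced with a short Monte Carlo sketch. It assumes, following the definition of threshold risk in the abstract, that $R(\delta)=P(p(X)\leq\delta)$ with $p(X)=X^{2}(X-2)$ and $X\sim\mathbb{N}(0,1^{2})$ ; the seed, sample size, step and function names are illustrative choices and not the settings used for Figure 1.

```python
import random


def p(x):
    """Unperturbed polynomial of Example 2, p(X) = X^2 (X - 2)."""
    return x * x * (x - 2.0)


def estimate_risk(samples, delta):
    """Monte Carlo estimate of R(delta) = P(p(X) <= delta) (assumed risk definition)."""
    return sum(1 for x in samples if p(x) <= delta) / len(samples)


# Illustrative settings (assumptions, not from the paper): seed, sample size, step.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
step = 0.01

# One-sided difference quotients of the estimated risk at delta = 0,
# computed on the same samples so that the three estimates are comparable.
risk_at_zero = estimate_risk(samples, 0.0)
left_slope = (risk_at_zero - estimate_risk(samples, -step)) / step
right_slope = (estimate_risk(samples, step) - risk_at_zero) / step

print(f"left difference quotient:  {left_slope:.3f}")
print(f"right difference quotient: {right_slope:.3f}")
```

Near $X=0$ we have $p(X)\approx-2X^{2}$ , so the event $\{p(X)\leq-\varepsilon\}$ excludes a neighbourhood of $0$ whose width is of order $\sqrt{\varepsilon}$ . Under the stated assumptions the left difference quotient should therefore grow as the step shrinks, while the right one stays small.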

Example 3: Let $p_{\delta}(X)=X^{2}-\delta X$ . Here the hidden equation is $\text{Dis}(p_{\delta}(X))=\delta^{2}=0$ , so we have one candidate for a risk critical point, $\delta=0.$ When $\delta=0$ , $X=0$ is a double root of $p_{\delta}(X)$ . Hence we simulated the risk using the random variable $X\sim\mathbb{N}(0,1^{2})$ .

Figure 2 once again shows the sensitivity of threshold risk in the neighbourhood of $\delta=0$ . Note, however, that as $\delta$ increases from $0$ the threshold risk increases rapidly from $0$ towards $0.5$ and, by the symmetry of the normal distribution, the same happens as $\delta$ decreases from $0$ .
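This behaviour can be checked in closed form under the assumption, made here only for illustration, that the risk in this example is $R(\delta)=P(p_{\delta}(X)\leq 0)$ with $X\sim\mathbb{N}(0,1^{2})$ and $\Phi$ denoting the standard normal distribution function. Since $X^{2}-\delta X\leq 0$ exactly when $X$ lies between $0$ and $\delta$ ,

R(\delta)=\big|\Phi(\delta)-\Phi(0)\big|=\Phi(|\delta|)-\tfrac{1}{2}.

Under this reading $R(0)=0$ and $R(\delta)$ increases towards $0.5$ as $|\delta|$ grows, which matches the symmetric shape noted for Figure 2.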

Example 4: Let $p_{\delta}(X)=X-\delta X^{2}$ . Here $\text{Dis}(p_{\delta}(X))=1$ , hence the hidden equation ($*$) has no roots. In accordance with Corollary 2, the remaining hidden equation ($**$) corresponds to the leading coefficient becoming 0. In this case, we have only one candidate for the risk critical point, $\delta=0$ . However, note that as $\delta$ approaches $0$ from above, the non-zero root $X=1/\delta$ of $p_{\delta}(X)=0$ approaches $\infty$ . Consequently, the sensitivity of the threshold risk only manifests itself for distributions with sufficiently heavy tails, and we therefore simulate the risk using the random variable $X\sim\text{Cauchy}(x_{0}=0,\gamma=1)$ .

Figure 3 exhibits the sensitivity of threshold risk in the neighbourhood of the risk critical point $\delta=0$ .
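A closed-form sketch may help to show why the tail weight matters in Example 4. It assumes, for illustration only, that the risk is $R(\delta)=P(p_{\delta}(X)\leq 0)$ , so that for $\delta>0$ the event is $\{X\leq 0\}\cup\{X\geq 1/\delta\}$ and for $\delta<0$ it is $\{1/\delta\leq X\leq 0\}$ . The function names are ours, and the standard normal case is included only for comparison.

```python
import math


def risk_cauchy(delta):
    """Assumed risk P(X - delta*X**2 <= 0) for X ~ Cauchy(0, 1), delta != 0."""
    # Tail probability P(X >= 1/|delta|) of the standard Cauchy distribution.
    tail = 0.5 - math.atan(1.0 / abs(delta)) / math.pi
    return 0.5 + tail if delta > 0 else 0.5 - tail


def risk_normal(delta):
    """Same assumed risk for X ~ N(0, 1), delta != 0 (comparison only)."""
    # Tail probability P(X >= 1/|delta|) of the standard normal distribution.
    tail = 0.5 * math.erfc(1.0 / (abs(delta) * math.sqrt(2.0)))
    return 0.5 + tail if delta > 0 else 0.5 - tail


if __name__ == "__main__":
    for delta in (-0.05, -0.01, 0.01, 0.05):
        print(delta, risk_cauchy(delta), risk_normal(delta))
```

Under these assumptions the Cauchy risk moves away from $0.5$ by roughly $\delta/\pi$ for small $|\delta|$ , whereas for the standard normal distribution the tail term is numerically negligible, so the root at $1/\delta$ leaves no visible trace in the risk.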

Appendix A

For the sake of completeness, we recall a number of important relationships involving polynomials, resultants and the discriminant. While proofs of some of these results can be found in many sources, we cite mainly the widely used reference [5]. We also cite [1] because it contains the proof of Theorem 3, which is not easily found elsewhere.

Definition 6.

For real polynomials $f(x)=a_{0}+a_{1}x+\ldots+a_{n}x^{n}$ and $g(x)=b_{0}+b_{1}x+\ldots+b_{m}x^{m}$ , with $\text{deg}(f)=n,\text{deg}(g)=m$ , their resultant $\text{Res}(f,g)$ is the determinant of the $(m+n)\times(m+n)$ Sylvester matrix, given by

\text{Res}(f,g)=\text{Det}\begin{bmatrix}a_{n}&a_{n-1}&\cdots&\cdots&a_{0}&0&\cdots&0\\ 0&a_{n}&\cdots&\cdots&a_{1}&a_{0}&\cdots&0\\ \vdots&\vdots&\ddots&\ddots&\vdots&\vdots&\ddots&a_{0}\\ b_{m}&b_{m-1}&\cdots&\cdots&b_{0}&0&\cdots&0\\ 0&b_{m}&b_{m-1}&\cdots&\cdots&b_{0}&\cdots&0\\ \vdots&\vdots&\ddots&\ddots&\vdots&\vdots&\ddots&b_{0}\\ \end{bmatrix}.

(25)

Theorem 2.

For real polynomials $f(x)=a_{0}+a_{1}x+\ldots+a_{n}x^{n}$ and $g(x)=b_{0}+b_{1}x+\ldots+b_{m}x^{m}$ , suppose that $f$ has roots $\alpha_{1},\ldots,\alpha_{n}$ and $g$ has roots $\beta_{1},\ldots,\beta_{m}$ (not necessarily distinct). Then the resultant can be computed as

\text{Res}(f,g)=a_{n}^{m}b_{m}^{n}\prod_{i=1}^{n}\prod_{j=1}^{m}(\alpha_{i}-\beta_{j}).

(26)

Proof.

See reference [5] page 408. ∎

Lemma 5.

For real polynomials $f(x)=a_{0}+a_{1}x+\ldots+a_{n}x^{n}$ and $g(x)=b_{0}+b_{1}x+\ldots+b_{m}x^{m}$ , where $m\leq n$ , suppose that $f$ has roots $\alpha_{1},\ldots,\alpha_{n}$ and $g$ has roots $\beta_{1},\ldots,\beta_{m}$ (not necessarily distinct). Then the resultant can be computed as

\text{Res}(f,g)=a_{n}^{m}\prod_{i=1}^{n}g(\alpha_{i}),

(27)

where $g(x)=b_{m}\prod_{j=1}^{m}(x-\beta_{j})$ , and $g(\alpha_{i})=b_{m}\prod_{j=1}^{m}(\alpha_{i}-\beta_{j})$ .

Proof.

Follows immediately from Theorem 2. ∎

Definition 7.

Let $f(x)=a_{0}+a_{1}x+a_{2}x^{2}+\ldots+a_{n}x^{n}$ be a real polynomial. The discriminant of $f$ is

\text{Dis}(f)=a_{n}^{2n-2}\prod_{1\leq i<j\leq n}(\alpha_{i}-\alpha_{j})^{2},

(28)

where $\alpha_{1},\ldots,\alpha_{n}$ are the roots of $f$ (not necessarily distinct).

Lemma 6.

Let $f=a_{0}+a_{1}x+a_{2}x^{2}+\ldots+a_{n}x^{n}$ with $a_{n}\neq 0$ . Then the discriminant of $f$ is given by

\text{Dis}(f)=(-1)^{n(n-1)/2}a_{n}^{-1}\text{Res}(f,f^{\prime}).

(29)

Proof.

Follows from equation (1.23) on Page 404 of [5]. ∎
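Definition 6 and Lemma 6 can be combined into a short numerical sketch that evaluates the discriminant at a fixed value of $\delta$ . The code below is an illustration under our own conventions (coefficients listed from the highest degree down, exact arithmetic with fractions, helper names chosen here), and is not the procedure used by the authors.

```python
from fractions import Fraction


def sylvester(f, g):
    """Sylvester matrix of f and g, coefficients listed from highest degree down."""
    n, m = len(f) - 1, len(g) - 1
    size = n + m
    rows = []
    for i in range(m):  # m shifted copies of the coefficients of f
        rows.append([Fraction(0)] * i + [Fraction(c) for c in f]
                    + [Fraction(0)] * (size - n - 1 - i))
    for i in range(n):  # n shifted copies of the coefficients of g
        rows.append([Fraction(0)] * i + [Fraction(c) for c in g]
                    + [Fraction(0)] * (size - m - 1 - i))
    return rows


def det(matrix):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    a = [row[:] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result


def discriminant(f):
    """Dis(f) = (-1)^(n(n-1)/2) * Res(f, f') / a_n, as in Lemma 6 (needs a_n != 0)."""
    n = len(f) - 1
    df = [c * (n - i) for i, c in enumerate(f[:-1])]  # coefficients of f'
    sign = -1 if (n * (n - 1) // 2) % 2 else 1
    return sign * det(sylvester(f, df)) / Fraction(f[0])


if __name__ == "__main__":
    delta = Fraction(1, 2)
    # Example 2: X^3 - 2 X^2 - delta, compared with -delta (27 delta + 32).
    print(discriminant([1, -2, 0, -delta]), -delta * (27 * delta + 32))
```

Evaluating `discriminant` at several values of $\delta$ gives a direct check of a hidden equation; for the cubic of Example 2 the values agree with $-\delta(27\delta+32)$ .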

Definition 8.

Let

Q(x,z)=q_{n}(z)x^{n}+q_{n-1}(z)x^{n-1}+\ldots+q_{0}(z)

(30)

be a bivariate polynomial in $x$ with the perturbation variable $z$ . Using (28), the discriminant of $Q(x,z)$ has the following form

\text{Dis}(Q,z)=q_{n}(z)^{2n-2}\prod_{i<j}(\alpha_{i}(z)-\alpha_{j}(z))^{2},

(31)

where $\alpha_{1}(z),\ldots,\alpha_{n}(z)$ are the roots of $Q(x,z)$ .

Let $\mathcal{Z}(Q)=\mathcal{Z}_{n}(Q)\cup\mathcal{Z}^{\prime}(Q)$ be the zero set of $\text{Dis}(Q,z)$ . More specifically, $\mathcal{Z}_{n}(Q)=\{z|q_{n}(z)=0\}$ and $\mathcal{Z}^{\prime}(Q)=\{z|\text{Dis}(Q,z)=0,q_{n}(z)\neq 0\}$ . The following theorem provides the analytic form of the root function $x=x(z)$ in various situations, depending on the nature of the point $z_{0}$ . We note that, in some cases, the latter is an analytic multi-valued function $f(z)$ defined in a punctured neighborhood of $z_{0}$ satisfying $Q(f(z),z)=0$ for all $z$ in the complement of $\mathcal{Z}(Q)$ . In those cases, the type of series expansion that results depends on the limiting properties of $f$ when $z$ approaches $z_{0}$ , as stated more precisely in the theorem.

Theorem 3.

(Classification of root expansions [1])

(a)

If $z_{0}\notin\mathcal{Z}(Q)$ and is not a zero of $q_{n}(z)$ , then in a neighborhood of $z_{0}$ every one of the $n$ branches of the solution $x(z)$ is holomorphic, and so it has the analytic representation

$x(z)=\sum_{k=0}^{\infty}c_{k}(z-z_{0})^{k}.$ (32)
(b)

If $z_{0}\in\mathcal{Z}^{\prime}(Q)$ , then $z_{0}$ is a branching point of some order $n^{\prime}\leq n$ for every branch $f(z)$ of the solution $x(z)$ and also $\lim_{z\rightarrow z_{0}}f(z)=0$ . In this case the solution $x(z)$ has a Puiseux series representation

$x(z)=\sum_{k=0}^{\infty}c_{k}(z-z_{0})^{k/n^{\prime}}.$ (33)
(c)

If $z_{0}\in\mathcal{Z}_{n}(Q)$ and is a zero of multiplicity $n_{0}>0$ of $q_{n}(z)$ , then for any branch $f(z)$ of $x(z)$ the point $z_{0}$ is a branching point of some order $n^{\prime}\leq n$ and $\lim_{z\rightarrow z_{0}}(z-z_{0})n^{n+\delta}f(z)=0$ for all $\delta>0$ . In this situation the solution $x(z)$ has a Laurent-Puiseux series representation

$x(z)=\sum_{k=-k_{0}}^{\infty}c_{k}(z-z_{0})^{k/n^{\prime}}.$ (34)
(d)

If $z_{0}\notin\mathcal{Z}(Q)$ and is the zero of multiplicity $m_{0}>0$ of $q_{m}(z)$ , then $z_{0}$ is a pole of order $m_{0}$ for every branch $f(z)$ of the solution $x(z)$ , and in this situation the solution $x(z)$ has a Laurent series representation

$x(z)=\sum_{k=-m_{0}}^{\infty}c_{k}(z-z_{0})^{k}.$ (35)

Proof.

See Theorem 4.8 on Page 93 [1]. ∎

References

[1] Avrachenkov, K., J. A. Filar, and P. G. Howlett (2013). Analytic perturbation theory and its applications. Philadelphia: Society for Industrial and Applied Mathematics.
[2] Ben-Tovim, D., T. Bogomolov, J. Filar, P. Hakendorf, S. Qin, and C. Thompson (2018, September). Hospital’s instability wedges. Health Systems 9(3), 202–211.
[3] Embrechts, P., C. Klüppelberg, and T. Mikosch (1997). Modelling Extremal Events. Berlin, Heidelberg: Springer Berlin Heidelberg.
[4] Filar, J. A., Z. Qiao, and S. Streipert (2020). Risk sensitivity in Beverton–Holt fishery with multiplicative harvest. Natural Resource Modeling 33(3), e12257.
[5] Gelfand, I. M., M. M. Kapranov, and A. V. Zelevinsky (1994). Discriminants, resultants, and multidimensional determinants. Mathematics: Theory & Applications. Boston: Birkhäuser.
[6] Gourieroux, C., J. Laurent, and O. Scaillet (2000). Sensitivity analysis of values at risk. Journal of Empirical Finance 7(3), 225–245. Special issue on Risk Management.
[7] Klüppelberg, C. and U. Stadtmüller (1998). Ruin probabilities in the presence of heavy-tails and interest rates. Scandinavian Actuarial Journal 1998(1), 49–58.
[8] Varadhan, S. R. S. (1984). Large deviations and applications. Number 46 in CBMS-NSF regional conference series in applied mathematics. Philadelphia, Pa: Society for Industrial and Applied Mathematics.