Robust Inference of One-Shot Device testing data with Competing Risk under Lindley Lifetime Distribution with an application to SEER Pancreas Cancer Data

Shanya Baghel^a Shuvashree Mondal^b [email protected]

Abstract

This article aims at the lifetime prognosis of one-shot devices subject to competing causes of failure. Based on the failure count data recorded across several inspection times, statistical inference of the lifetime distribution is studied under the assumption of Lindley distribution. In the presence of outliers in the data set, the conventional maximum likelihood method or Bayesian estimation may fail to provide a good estimate. Therefore, robust estimation based on the weighted minimum density power divergence method is applied both in classical and Bayesian frameworks. Thereafter, the robustness behaviour of the estimators is studied through influence function analysis. Further, in density power divergence based estimation, we propose an optimization criterion for finding the tuning parameter which brings a trade-off between robustness and efficiency in estimation. The article also analyses when the cause of failure is missing for some of the devices. The analytical development has been restudied through a simulation study and a real data analysis where the data is extracted from the SEER database.

keywords:

Density Power Divergence , Influence function, Lindley Distribution , Missing Cause of Failure, One shot devices, Optimal tuning parameter , Robust Bayes estimation

MSC:

62F10, 62F12, 62NO2.

¹¹footnotemark: 1

\affiliation

organization=Department of Mathematics and Computing, Indian Institute of Technology (Indian School of Mines) Dhanbad, postcode=826004, state=Jharkhand, country=India

\affiliation

organization=Department of Mathematics and Computing, Indian Institute of Technology (Indian School of Mines) Dhanbad, postcode=826004, state=Jharkhand, country=India

{highlights}

Robust classical inference based on weighted density power divergence

Robust Bayesian inference based on weighted density power divergence

Study of robustness through influence function

A new data-driven approach for optimal tuning parameter

Missing cause of failure analysis

Application to SEER Pancreas Cancer Data

1 Introduction

The prevalence of one-shot devices can be seen in various fields of human inventions. These devices remain torpid until activated and are immediately destroyed after use. The exact failure time of such devices is not observable in most instances. Therefore, the observation is limited to recording whether the device failed before or after a specified time, leading to dichotomous data. Several studies are devoted to the lifetime study of highly reliable one-shot devices under the accelerated life testing (ALT) setup.

Fan et al. [1] applied one-shot device testing data for the analysis of highly reliable electro-explosive devices using the Bayesian approach with an exponential, normal and beta prior distribution. Balakrishnan and Ling [2] were engaged in the classical inference of the lifetime distribution of one-shot devices with constant stress ALT under exponential lifetime distribution and compared it with the Bayesian framework developed by Fan et al. [1]. Balakrishnan and Ling [3] extended this work [2] for multiple stress models. Under constant stress ALT, Balakrishnan and Ling [4] assumed gamma lifetime distribution and drew classical inference in context one-shot devices data analysis.

There are many possible causes that can result in the failure of any one-shot device. For example, a fuse can be failed due to high temperature, humidity, voltage or overloading. The competing risk analysis confides in identifying the particular cause that caused the ultimate failure of a device. In survival analysis where one-shot device data corresponds to current-status data, there may be multiple causes of death in which there is one cause which occurs earliest and ultimately results in the demise of an individual. Crowder [5] thoroughly studied the statistical inference of classical competing risks. Several studies under competing risk setup were conducted thereafter [6, 7, 8, 9]. In the context of one-shot devices, Balakrishnan et al. [10, 11] drew classical inference with constant stress competing risk set up under exponential and Weibull lifetime distributions, respectively. Balakrishnan et al. [12] also went through the Bayesian approach for competing risk analysis of one-shot devices under exponential lifetime distribution.

Most of the work is focused on exponential, gamma and Weibull as lifetime distributions for one-shot device data in the literature. In this work, we come up with the study of statistical inference of the lifetime distribution of one-shot devices under independent competing risk setup where the lifetime under each risk follows two-parameter Lindley distribution. Lindley distribution was introduced by Lindley [13], [14] to analyse lifetime data, especially to study stress-strength reliability. Ghitany et al. [15] showed with evidence that the Lindley distribution offers better modelling than exponential distribution in terms of mathematical properties like mode, coefficient of variation, skewness, kurtosis and hazard rate shape. Lindley distribution has broad applicability in analysing failure time data. Mazucheli and Achcar [16] used it as an alternative to exponential or Weibull lifetime data under the competing risk setup.

In reality, it is quite likely to get data with outliers. The conventional inferential methods designed for ideal situations fail here, and robust inference methods are needed. In the literature, Basu et al. [17] introduced the density power divergence-based robust method of estimation. In one-shot devices data analysis, Balakrishnan et al. [18, 19, 20] carried the density power divergence based robust inference under gamma, Weibull and exponential lifetime models. It has been observed that very few works on competing risk with robust estimation are available in the literature. Only recently, Balakrishnan et al. [21], Balakrishnan and Castilla [22] developed robust inference under competing risk setup for one-shot devices. Further, it has been observed through extensive literature surveys that robust Bayes estimation has yet to be studied in the context of one-shot devices. The methodology of drawing a robust inference in the Bayesian model was first introduced by Hooker and Vidyashankar [23]. Further, Ghosh and Basu [24] developed robust Bayes estimation where the likelihood function has been substituted by the density power divergence measure in the posterior density function.

The present article focuses on developing a robust estimation method in classical and Bayesian frameworks where the failure of one-shot devices is subject to independent competing risks with an interval monitoring setup under Lindley lifetime distribution. The estimation procedure is based on the density power divergence measure between the true and assumed lifetime distributions. Certain weights are associated with the density power divergence measure and thereafter, a weighted minimum density power divergence estimator (WMDPDE) is obtained. Further, we study the asymptotic behaviour of the robust WMDPD estimator under this setup. With the availability of prior information, Bayesian inference is quite essential. In this work, following the idea by Ghosh and Basu [24], a weighted robust Bayes estimation (WRBE) is proposed where the robustified posterior density was developed on the exponential form of the maximiser equation based on density power divergence measure. The Normal distribution and Dirichlet distribution are taken as priors which are based on the data. Further, the robustness of the derived estimators is studied through the influence function. The numerical study is conducted through simulation experiments and real data analysis to assess the performances of the derived methods.

It is to be noted that in the estimation procedure based on the density power divergence method, the tuning parameter plays a pivotal role in bringing a balance between robustness and efficiency. Warwick and Jones [25] introduced a data-driven algorithm for choosing the optimal tuning parameter, which was further studied and modified by several authors ([26, 27, 28]). In this work, we propose an approach aiming to increase the robustness as well as precision of the estimation. In that direction, a numerical study is also presented.

The data analysis is conducted using information extracted from the database named “Incidence $\cdot$ SEER Research Data, 8 Registries, Nov 2021 Sub (1975-2019)” [29] which is maintained by National Cancer Institute (Surveillance, Epidemiology, and End Results Program (SEER)). The data is focused only on individuals diagnosed with pancreatic cancer from 2008-2010. The layout of data is given in Table (7). It was also observed that for some of the patients, the demise of the patient was recorded, but the cause of death was missing. This is frequently followed in real life, where the cause of the failure remains unspecified. Lu and Liang [30] studied that analysis using only known causes of failures would lead to biased results. To avoid this, the missing cause of failure analysis has been conducted separately in this work and the layout of the missing cause of failure data is given in Table (10). This real data set is analysed by exploiting the developed methods and the results are discussed thoroughly.

The rest of the article comprises sections divided as follows. Section 2 is concerned with model building and computation of maximum likelihood estimation. Section 3 focuses on deriving a robust WMDPD estimator and studying its asymptotic property. The robust Bayes method of estimation is introduced in section 4. In section 5, we study the robustness properties of the estimators through the influence function. Section 6 is devoted to the missing cause of failure analysis. Section 7 deals with the numerical study of the theoretical results developed in previous sections through a simulation study followed by the finding of optimal tuning parameter and data analysis. Concluding remarks are given in Section 8.

2 Model Description

This section is concerned with the model formulation for the lifetime data analysis of one-shot devices and the development of Maximum Likelihood Estimates (MLE).

One-shot devices are studied under accelerated life testing (ALT), where $G$ devices are distributed into $I$ independent groups. For $i=1,2,\dots,I,$ let the number of devices in $ith$ group be $g_{i}$ where $G=g_{1}+g_{2}+\dots+g_{I}$ . These devices are subject to $J$ type of stress factors, quantified by $s_{ij}$ ; $j=1,\dots,J$ , where $s_{i0}=1$ . The failure mechanism is subject to R independent competing causes. Let, $n_{ilr}$ denote the number of failures in the $ith$ group due to cause $r$ for $r=1,2,\dots,R,$ between the inspection times $(\tau_{i(l-1)},\tau_{il}]$ where $l=1,2,\dots,L$ , and $\tau_{i0}=0.$ Therefore, the number of devices that survive after inspection time $\tau_{iL}$ for the $ith$ group is $k_{i}=g_{i}-\sum_{l=1}^{L}\sum_{r=1}^{R}n_{ilr}$ . Table (1) shows the layout in tabular form.

Table 1: Model layout

Groups	Devices	Co-factors				Inspection
		Stress 1	Stress 2	…	Stress J	Times
1	$g_{1}$	$s_{11}$	$s_{12}$	…	$s_{1J}$	$\tau_{11}$	$\tau_{12}$	…	$\tau_{1L}$
2	$g_{2}$	$s_{21}$	$s_{22}$	…	$s_{2J}$	$\tau_{21}$	$\tau_{22}$	…	$\tau_{2L}$
⋮	⋮	⋮	⋮	$\udots$	⋮	⋮	⋮	$\udots$	⋮
i	$g_{i}$	$s_{i1}$	$s_{i2}$	…	$s_{iJ}$	$\tau_{i1}$	$\tau_{i2}$	…	$\tau_{iL}$
⋮	⋮	⋮	⋮	$\udots$	⋮	⋮	⋮	⋮	⋮
I	$g_{I}$	$s_{I1}$	$s_{I2}$	…	$s_{IJ}$	$\tau_{I1}$	$\tau_{I2}$		$\tau_{IL}$
Groups	Survivals	Failures
1	$k_{1}$	$(n_{111},\dots,n_{11R})$			…	$(n_{1L1},\dots,n_{1LR})$
2	$k_{2}$	$(n_{211},\dots,n_{21R})$			…	$(n_{2L1},\dots,n_{2LR})$
⋮	⋮	⋮			$\udots$	⋮
i	$k_{i}$	$(n_{i11},\dots,n_{i1R})$			…	$(n_{iL1},\dots,n_{iLR})$
⋮	⋮	⋮			$\udots$	⋮
I	$k_{I}$	$(n_{I11},\dots,n_{I1R})$			…	$(n_{IL1},\dots,n_{ILR})$

In $i$ th group for $i=1,2,\dots,I,$ the failure time due to $r$ th competing cause for $r=1,\ldots,R,$ is denoted by the random variable $T_{ir},$ which is assumed to follow two-parameter Lindley distribution with shape parameter $\alpha_{ir}$ and scale parameter $\theta_{ir}$ . The cumulative distribution function and probability density function of $T_{ir}$ are as follows.

	$\displaystyle F_{ir}(t)$	$\displaystyle=1-\left(\frac{1+\alpha_{ir}\theta_{ir}+\theta_{ir}t}{\alpha_{ir}\theta_{ir}+1}\right)e^{-\theta_{ir}t}\;;\;(t,\theta_{ir})>0,\alpha_{ir}+\theta_{ir}>0,$
	$\displaystyle f_{ir}(t)$	$\displaystyle=\frac{\theta_{ir}^{2}}{\alpha_{ir}\theta_{ir}+1}\left(\alpha_{ir}+t\right)e^{-\theta_{ir}t}\;;\;(t,\theta_{ir})>0,\alpha_{ir}+\theta_{ir}>0.$

Both shape and scale parameters are related to stress factors in log-linear form as

\alpha_{ir}=exp\left\{\sum_{j=0}^{J}a_{rj}s_{ij}\right\},\;\theta_{ir}=exp\left\{\sum_{j=0}^{J}b_{rj}s_{ij}\right\}\;;\;r=1,2,\dots,R.

For the computational conveniences, two competing causes of failure and one stress factor are considered without the additive constant and denote $s_{i1}=s_{i}.$ Hence, $\bm{\Lambda}=\{(a_{1},b_{1},a_{2},b_{2})\}^{{}^{\prime}}$ are the model parameters to be estimated.

The failure probabilities $p_{il1}$ , $p_{il2}$ due to cause 1, cause 2, respectively, in the interval $(\tau_{i(l-1)},\tau_{il})$ for $l=1,\ldots,L$ and the survival probability are obtained as follows.

	$\displaystyle p_{il1}$	$\displaystyle=P(\tau_{i(l-1)}<T_{i1}\leq\tau_{il},T_{i2}>T_{i1})$
	$\displaystyle p_{il2}$	$\displaystyle=P(\tau_{i(l-1)}<T_{i2}\leq\tau_{il},T_{i1}>T_{i2})$
	$\displaystyle p_{i0}$	$\displaystyle=P(T_{il1g}>\tau_{iL},T_{il2g}>\tau_{iL}).$

Result 1.

Under the assumption of Lindley distribution,

	$\displaystyle p_{il1}$	$\displaystyle=\frac{\theta_{i1}^{2}\theta_{i}^{-3}A^{(p_{1})}_{i}}{\prod_{r=1}^{2}(\alpha_{ir}\theta_{ir}+1)}\;;\;p_{il2}=\frac{\theta_{i2}^{2}\theta_{i}^{-3}A^{(p_{2})}_{i}}{\prod_{r=1}^{2}(\alpha_{ir}\theta_{ir}+1)};$
	$\displaystyle p_{i0}$	$\displaystyle=\prod_{r=1}^{2}\left(\frac{1+\alpha_{ir}\theta_{ir}+\tau_{iL}\theta_{ir}}{\alpha_{ir}\theta_{ir}+1}\right)e^{-\tau_{iL}\theta_{ir}}.$		(1)

Proof.

The proof of the result and description of notations are given in the appendix. ∎

Based on the observed failure count data, the Likelihood function can be obtained as,

L(\bm{\bm{\Lambda}})\propto\prod_{i=1}^{I}\left[(p_{i0})^{k_{i}}\prod_{l=1}^{L}\prod_{r=1}^{2}(p_{ilr})^{n_{ilr}}\right].

(2)

Therefore, the log-likelihood function is given as,

ln\,L(\bm{\bm{\Lambda}})\propto\sum_{i=1}^{I}\left[k_{i}\,ln\,(p_{i0})]+\sum_{l=1}^{L}\sum_{r=1}^{2}n_{ilr}\,ln\,(p_{ilr})\right].

(3)

Hence MLE of $\bm{\bm{\Lambda}},$ denoted by $\;\hat{\bm{\bm{\Lambda}}}=\{\hat{a}_{1},\hat{b}_{1},\hat{a}_{2},\hat{b}_{2}\}$ would be derived as

\hat{\bm{\bm{\Lambda}}}=arg\mathop{max}_{\bm{\bm{\Lambda}}}lnL(\bm{\bm{\Lambda}}).

(4)

provided $\sum_{i=1}^{I}\sum_{l=1}^{L}\sum_{r=1}^{2}n_{ilr}>0.$

Result 2.

The set of estimating equations for obtaining MLE is given as follows:

\sum_{i=1}^{I}s_{i}\bm{\Theta}_{i}\left[\frac{k_{i}}{p_{i0}}B_{i0}^{(\bm{\bm{\Lambda}})}C_{i}^{(p_{0})}+\sum_{l=1}^{L}\sum_{r=1}^{2}\frac{n_{ilr}}{p_{ilr}}B_{ilr}^{(\bm{\bm{\Lambda}})}C_{i}^{(p_{r})}\right]=\mathbf{0_{4}}.

(5)

Proof.

The proof of the result and description of notations are given in the appendix. ∎

In the presence of outliers in the data set, MLE cannot provide a valid estimated value. Therefore, some robust estimation method needs to be developed.

3 Robust point estimation

In this section, a robust estimation method based on the density power divergence (DPD) measure will be discussed. The DPD-based estimation was proposed by Basu et al. [17]. Under the assumption that two probability distributions $F$ and $G$ having probability densities $f$ and $g$ respectively, the DPD measure between $g$ and $f$ is given as,

D_{\gamma}(g,f)=\int\Bigg{\{}f^{1+\gamma}(t)-\left(1+\frac{1}{\gamma}\right)g(t)\,f^{\gamma}(t)+\left(\frac{1}{\gamma}\right)g^{1+\gamma}(t)\Bigg{\}}~{}dt\,;\quad 0<\gamma\leq 1.

where $\gamma$ is termed as the tuning parameter.

For one-shot device data, the DPD measure is computed between empirical and theoretical probability distributions. The empirical failure probability due to two different causes and empirical survival probability is defined as,

\left(\hat{q}_{il1},\hat{q}_{il2},\hat{q}_{i0}\right)=\left(\frac{n_{il1}}{g_{i}},\frac{n_{il2}}{g_{i}},\frac{k_{i}}{g_{i}}\right)\,;\;i=1,2,\dots,I\,;l=1,2,\dots,L.

(6)

where the theoretical failure probabilities and survival probabilities are given by equation (1). Applying weights proportional to the size of groups, the weighted DPD (WDPD) measure with the weights, $w_{i}=\frac{g_{i}}{G}$ combining all the I groups is obtained as,

	$\displaystyle D^{w}_{\gamma}(\bm{\bm{\Lambda}})=\sum_{i=1}^{I}\frac{g_{i}}{G}$	$\displaystyle\left[\left\{(p_{i0})^{\gamma+1}+\sum_{l=1}^{L}\sum_{r=1}^{2}(p_{ilr})^{\gamma+1}\right\}\right.-\frac{\gamma+1}{\gamma}\bigg{\{}(\hat{q}_{i0}p_{i0}^{\gamma})+\Bigg{.}$
		$\displaystyle\left.\sum_{l=1}^{L}\sum_{r=1}^{2}\hat{q}_{ilr}(p_{ilr})^{\gamma}\right\}+\left.\frac{1}{\gamma}\left\{(\hat{q}_{i0})^{\gamma+1}+\sum_{l=1}^{L}\sum_{r=1}^{2}(\hat{q}_{ilr})^{\gamma+1}\right\}\right].$		(7)

When $\gamma$ tends to $0,$ $D^{w}_{\gamma}(\bm{\bm{\Lambda}})$ will converge to weighted Kullback-Leibler (KL) divergence measure $D_{KL}(\bm{\bm{\Lambda}})$ where

\displaystyle D_{KL}(\bm{\bm{\Lambda}})

\displaystyle=\sum_{i=1}^{I}\frac{g_{i}}{G}\hat{q}_{i0}\,ln\left\{\frac{\hat{q}_{i0}}{p_{i0}}\right\}+\sum_{l=1}^{L}\sum_{r=1}^{2}\hat{q}_{ilr}\,ln\left\{\frac{\hat{q}_{ilr}}{p_{ilr}}\right\}.

DPD measure can also be written in the form of likelihood function as follows:

\mathop{lim}_{\gamma\to 0^{+}}D^{w}_{\gamma}(\bm{\bm{\Lambda}})=c-\frac{1}{G}\;lnL(\bm{\bm{\Lambda}}).

where $c$ contains the terms independent of parameters.
The weighted minimum density power divergence estimators (WMDPDE) for estimating $\bm{\bm{\Lambda}}$ can be obtained by minimizing the WDPD measure as follows:

\hat{\bm{\bm{\Lambda}}}_{\gamma}^{w}=arg\mathop{min}_{\bm{\bm{\Lambda}}}D^{w}_{\gamma}(\bm{\bm{\Lambda}}).

(8)

Note that, the tuning parameter $\gamma$ plays a role in bringing a balance between robustness and efficiency in this estimation.

Result 3.

The set of estimating equations for obtaining WMDPDEs is given as follows:

\sum_{i=1}^{I}g_{i}\,s_{i}\bm{\Theta}_{i}\left[Q_{i}B^{(\bm{\bm{\Lambda}})}_{i0}C_{i}^{(p_{0})}+\sum_{l=1}^{L}\sum_{r=1}^{2}Q_{ilr}B^{(\bm{\bm{\Lambda}})}_{ilr}C_{i}^{(p_{r})}\right]=\mathbf{0_{4}}.

(9)

Proof.

The proof of the theorem and description of notations are given in the appendix. ∎

As it is seen in the Results (2) and (3), the explicit form of the MLEs and WMDPDEs could not be found. Hence, the Co-ordinate Descent algorithm is used to obtain the estimates. The steps of the algorithm are given in Algorithm (1)

Algorithm 1 Coordinate-Descent Algorithm

•

Chose the initial values $\bm{\Lambda}_{0}=(a^{0}_{1},b_{1}^{0},a^{0}_{2},b_{2}^{0})$ .
•

At the $m+1^{th}$ iteration, the estimate of the parameters can be derived as
$a^{(m+1)}_{1}=a^{(m)}_{1}-\alpha\frac{\partial D(a_{1}^{(m)},b_{1}^{(m)},a_{2}^{(m)},b_{2}^{(m)})}{\partial a_{1}}\\ b^{(m+1)}_{1}=b^{(m)}_{1}-\alpha\frac{\partial D(a_{1}^{(m+1)},b_{1}^{(m)},a_{2}^{(m)},b_{2}^{(m)})}{\partial b_{1}}\\ a^{(m+1)}_{2}=a^{(m)}_{2}-\alpha\frac{\partial D(a_{1}^{(m+1)},b_{1}^{(m+1)},a_{2}^{(m)},b_{2}^{(m)})}{\partial a_{2}}\\ b^{(m+1)}_{2}=b^{(m)}_{2}-\alpha\frac{\partial D(a_{1}^{(m+1)},b_{1}^{(m+1)},a_{2}^{(m+1)},b_{2}^{(m)})}{\partial b_{2}}\\$ where $D=-ln\,L(\bm{\Lambda})$ for MLE and $D=D^{w}_{\gamma}(\bm{\Lambda})$ for WMDPDE and $\alpha$ is the learning rate.
•

Continue the process until { $max|\bm{\Lambda}^{(m+1)}-\bm{\Lambda}^{(m)}|\,,\,max|D(\bm{\Lambda}^{(m+1)})-D(\bm{\Lambda}^{(m)})|<c$ } where $c$ is a predefined threshold value.

To study the asymptotic behaviour of WMDPDE, the following theorem is presented, which is based on the idea of Calvino et al.[31].

Theorem 1.

Let $\bm{\Lambda}^{0}$ be the true value of the parameter $\bm{\Lambda}$ . The asymptotic distribution of WMDPDE of $\bm{\Lambda}$ , $\hat{\bm{\Lambda}}_{\gamma}$ , is given by,

\sqrt{G}(\hat{\bm{\Lambda}}_{\gamma}-\bm{\Lambda}^{0})\xrightarrow[G\to\infty]{\mathscr{L}}N\left(\mathbf{0_{4}},Q_{\gamma}^{-1}(\bm{\Lambda}^{0})R_{\gamma}(\bm{\Lambda}^{0})Q_{\gamma}^{-1}(\bm{\Lambda}^{0})\right)

(10)

Proof.

The proof of the theorem and description of notations are given in the appendix. ∎

4 Robust Bayes Method of Estimation

Bayesian inference is of paramount interest when some prior information is available about the model parameters. Conventional Bayes estimation based on likelihood-based posterior is quite popular because of several optimal properties. But the major drawback is that it can not produce a good estimated value when data come with contamination. In the literature, the non-robustness problem is tried to be solved by replacing the likelihood function in the posterior by some robust loss function and the derived posterior is called a pseudo posterior. Readers may see Greco et al. [32], Chérief et al. [33], Jewson et al. [34], Nakagawa and Hashimoto [35] in this reference. Inspired by Ghosh and Basu [24], a robust Bayesian estimation based on the power divergence measure is proposed here for the reliability analysis of a one-shot device with a competing risk interval monitoring set-up. The following developments are done in that direction.
Define,

	$\displaystyle B^{w}_{\gamma}(\bm{\Lambda})=\sum_{i=1}^{I}\frac{g_{i}}{G}$	$\displaystyle\left[\frac{1}{\gamma}\left\{(\hat{q}_{i0}p_{i0}^{\gamma})+\sum_{l=1}^{L}\sum_{r=1}^{2}\hat{q}_{ilr}(p_{ilr})^{\gamma}\right\}\right.$
		$\displaystyle\qquad\qquad\left.-\frac{1}{\gamma+1}\left\{(p_{i0})^{\gamma+1}+\sum_{l=1}^{L}\sum_{r=1}^{2}(p_{ilr})^{\gamma+1}\right\}\right].$		(11)

where WMDPDE with $\gamma>0$ is the maximizer of $B^{w}_{\gamma}(\bm{\Lambda})$ . Therefore, the weighted robust posterior density, a pseudo posterior can be defined as follows,

\pi^{w}_{\gamma}(\bm{\Lambda}|data)=\frac{\exp\left(B^{w}_{\gamma}(\bm{\Lambda})\right)\pi(\bm{\Lambda})}{\int\exp\left(B^{w}_{\gamma}(\bm{\Lambda})\right)\pi(\bm{\Lambda})\,d\bm{\Lambda}}.

(12)

Here, $\pi(\bm{\Lambda})$ is the joint prior density, and $\pi^{w}_{\gamma}(\bm{\Lambda}|data)$ is the proper density for $\gamma\geq 0$ . For $\gamma\rightarrow 0$ , the robust pseudo posterior will converge to the conventional likelihood-based posterior density. For any loss function $L(.,.),$ the Bayes estimator can be obtained as

arg\min_{t}\int L(\bm{\Lambda},t)\pi^{w}_{\gamma}(\bm{\Lambda}|data)d\bm{\Lambda}.

For the squared error loss function, the Weighted Robust Bayes Estimator (WRBE) can be obtained as

\hat{\bm{\Lambda}}^{w}_{b\gamma}=\int\bm{\Lambda}\pi^{w}_{\gamma}(\bm{\Lambda}|data)\,d\bm{\Lambda}.

(13)

4.1 Choice of Priors

In Bayesian inference choice of prior governs the estimation. In this section, we mention few such prior choices. For the model parameters $a_{1},a_{2},b_{1},b_{2},$ the interpretation of prior choice is not so meaningful. Following the idea of Fan et al. [1], the prior information on $p_{ilr}$ ’s is considered. For further development, we need the emperical estimates of $p_{ilr}$ ’s given in (6). But to avoid the zero-frequency situation, we follow the idea of Lee et al.(1985) [36] and define,

(\tilde{q}_{i0},\tilde{q}_{il1},\tilde{q}_{il2})=\left(\frac{k_{i}+1}{g_{i}+3},\frac{n_{il1}+1}{g_{i}+3},\frac{n_{il2}+1}{g_{i}+3}\right).

(14)

4.1.1 Normal Prior based on data

Define, error $\{e_{ilr}\}$ ’s as the difference between empirical estimates and the true probabilities as follows.

\tilde{q}_{ilr}=p_{ilr}+e_{ilr}\;;\quad i=1,2,\dots,I\;;\;l=1,2,\dots,L\;;\;r=1,2.

(15)

with the assumption that $e_{ilr}$ are independent $N(0,\sigma^{2})$ variables. Therefore, the conditional likelihood function as the prior distribution of $\bm{\Lambda}$ given $\sigma^{2}$ can be obtained as follows.

L(\bm{\Lambda}|\sigma^{2})\propto\prod_{i=1}^{I}\prod_{l=1}^{L}\prod_{r=1}^{2}\frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left\{-\frac{1}{2\sigma^{2}}(p_{ilr}-\tilde{q}_{ilr})^{2}\right\}.

(16)

Using non-informative prior of $\sigma^{2}$ , $\pi(\sigma^{2})\propto\frac{1}{\sigma^{2}},$ the joint prior density of $\bm{\Lambda}$ is given as follows.

\pi^{(1)}(\bm{\Lambda})\propto\int_{0}^{\infty}L(\bm{\Lambda}|\bm{\tau},\bm{s},\sigma^{2})\pi(\sigma^{2})d\sigma^{2}\propto\left\{\sum_{i=1}^{I}\sum_{l=1}^{L}\sum_{r=1}^{2}(p_{ilr}-\tilde{q}_{ilr})^{2}\right\}^{-IL}.

Then by (12), posterior density would be given as follows.

\pi^{(1)}_{\gamma}(\bm{\Lambda}|data)\propto\exp\left(B^{w}_{\gamma}(\bm{\Lambda})\right)\left\{\sum_{i=1}^{I}\sum_{l=1}^{L}\sum_{r=1}^{2}(p_{ilr}-\tilde{q}_{ilr})^{2}\right\}^{-IL}.

(17)

4.1.2 Dirichlet Prior based on data

If a parameter can be interpreted as some probability, beta prior is a very natural choice. Extending this idea, a Dirichlet prior is considered for the failure and survival probabilities as follows.

\pi_{i}^{(2)}=\frac{p_{i0}^{\beta_{i0}-1}\prod_{l=1}^{L}\prod_{r=1}^{2}p_{ilr}^{\beta_{ilr}-1}}{\bm{B}(\bm{\beta_{i}})}.

where, $\beta_{i0},\beta_{ilr}>0$ for $i=1,\ldots,I$ $l=1,\ldots,L$ , $r=1,2$ and

\bm{B}(\bm{\beta_{i}})=\frac{\Gamma\beta_{i0}\prod_{l=1}^{L}\prod_{r=1}^{2}\Gamma\beta_{ilr}}{\Gamma\left(\beta_{i0}+\sum_{l=1}^{L}\sum_{r=1}^{2}\beta_{ilr}\right)}.

The hyper-parameters are chosen equating the expectation of the failure and survival probabilities with their empirical estimates and equating variance as a known constant. Therefore, we get

E(p_{i0})=\frac{\beta_{i0}}{\beta_{i0}+\sum_{l=1}^{L}\sum_{r=1}^{2}\beta_{ilr}}=\tilde{q}_{i0}\;,\;E(p_{ilr})=\frac{\beta_{ilr}}{\beta_{i0}+\sum_{l=1}^{L}\sum_{r=1}^{2}\beta_{ilr}}=\tilde{q}_{ilr}\;;\;r=1,2

(18)

Var(p_{i0})=\frac{\beta_{i0}\left(\sum_{r=1}^{2}\beta_{ilr}\right)}{\left(\beta_{i0}+\sum_{l=1}^{L}\sum_{r=1}^{2}\beta_{ilr}\right)^{2}\left(\beta_{i0}+\sum_{l=1}^{L}\sum_{r=1}^{2}\beta_{ilr}+1\right)}=\sigma^{2}_{(p)}.

(19)

where, $\sigma^{2}_{(p)}$ is assumed to be a prefixed quantity. By equations (18) and (19) the estimates of hyper-parameters are,

	$\displaystyle\hat{\beta}_{il1}$	$\displaystyle=\tilde{q}_{il1}\left\{\frac{\tilde{q}_{i0}(1-\tilde{q}_{i0})}{\sigma^{2}_{(p)}}-1\right\}\;,\quad\hat{\beta}_{il2}=\tilde{q}_{il2}\left\{\frac{\tilde{q}_{i0}(1-\tilde{q}_{i0})}{\sigma^{2}_{(p)}}-1\right\}\quad\text{and}$
	$\displaystyle\hat{\beta}_{i0}$	$\displaystyle=\left\{\frac{\tilde{q}_{i0}(1-\tilde{q}_{i0})}{\sigma^{2}_{(p)}}-1\right\}-\sum_{l=1}^{L}\sum_{r=1}^{2}\hat{\beta}_{ilr}.$

Then by (12), the joint posterior density would be given as follows.

\pi^{(2)}_{\gamma}(\bm{\Lambda}|data)\propto\exp\left(B^{w}_{\gamma}(\bm{\Lambda})\right)\prod_{i=1}^{I}\left[p_{i0}^{\hat{\beta}_{i0}-1}\prod_{l=1}^{L}\prod_{r=1}^{2}\left\{p_{ilr}^{\hat{\beta}_{ilr}-1}\right\}\right].

(20)

Under both the prior assumption, the Bayes estimate cannot be obtained in closed form. Hence we rely on the Metropolis-Hastings (MH) [37] algorithm. The steps of the algorithm is given in algorithm (2).

Algorithm 2 Metropolis-Hastings (MH) Algorithm

•

Chose the initial value $\bm{\Lambda}_{0}=(a^{0}_{1},b_{1}^{0},a^{0}_{2},b_{2}^{0})$ .
•

For $i=1,2,\dots,N;$ generate a value $\bm{\Lambda}^{*}=(a^{*}_{1},b_{1}^{*},a^{*}_{2},b_{2}^{*})$ from proposal distribution $\pi^{*}_{\gamma}({\cdot}|\bm{\Lambda}_{i-1}).$ where $\bm{\Lambda}_{i-1}=(a^{i-1}_{1},b_{1}^{i-1},a^{i-1}_{2},b_{2}^{i-1}).$
•

Obtain acceptance probability $\alpha=min\left\{1,\frac{\pi^{*}_{\gamma}(\bm{\Lambda}_{i-1}|\bm{\Lambda}^{*})\pi^{(k)}_{\gamma}(\bm{\Lambda}^{*}|data)}{\pi^{*}_{\gamma}(\bm{\Lambda}^{*}|\bm{\Lambda}_{i-1})\pi^{(k)}_{\gamma}(\bm{\Lambda}_{i-1}|data)}\right\}$ for $k=1,2.$
•

Generate a random number $u\sim U(0,1)$ :
If $u<\alpha$ , set $\bm{\Lambda}_{i}=\bm{\Lambda}^{*},$ else $\bm{\Lambda}_{i}=\bm{\Lambda}_{i-1}$ where $\bm{\Lambda}_{i}=(a^{i}_{1},b_{1}^{i},a^{i}_{2},b_{2}^{i}).$

Note that, for each of the $a_{1},b_{1},a_{2},b_{2},$ the proposal distribution is chosen as

\pi_{\gamma}(\cdot|y)\sim\delta\,U(y-a,y+a)+(1-\delta)\,N(y,\sigma^{2})\;;\;0<\delta<1.

where $U(y-a,y+a)$ denotes a uniform distribution with range $(y-a,y+a)$ and $N(y,\sigma^{2})$ is the normal distribution with mean y and variance $\sigma^{2}$ . The mixture of Uniform and Normal distribution is taken so that a high acceptance rate can be achieved. For estimation purposes, a sequence of random variables is generated through the MH algorithm. Let the total number of generated values be $N$ . First say $N_{0}$ iterative values are removed for burn-in and after $N_{0}^{th}$ values onwards, every $m^{th}$ value is taken to reduce auto-correlation among generated values. In the simulation study, the specific values of $N$ , $N_{0}$ and $m$ are taken for the analysis. A total of say $N^{{}^{\prime}}$ values for each of $a_{1},b_{1},a_{2},b_{2}$ are finally obtained. Based on these obtained values, the Bayes estimates and the highest posterior density credible intervals (HPD CRI) of the model parameters can be approximated by using algorithm (3).

Algorithm 3 Bayes Estimates and HPD Credible Intervals

•

The approximated Bayes estimate of $a_{1}$ can be obtained by $\frac{1}{N^{{}^{\prime}}}\sum_{i=1}^{N^{{}^{\prime}}}a^{i}_{1}$ .
•

For $100(1-\xi)\%$ CRI of $a_{1}$ :
Sort $a^{i}_{1}$ ’s in ascending order to obtain $(a^{(1)}_{1},a^{(2)}_{1},\ldots,a^{(N^{{}^{\prime}})}_{1})$ and $(a^{(j)}_{1},a^{(j+[N^{{}^{\prime}}(1-\xi)])}_{1})$ for $j=1,\dots,[N^{{}^{\prime}}\xi]$ is the $100(1-\xi)\%$ credible inervals .
•

The $100(1-\xi)\%$ HPD CRI is $(a^{(j^{*})}_{1},a^{(j^{*}+[N^{{}^{\prime}}(1-\xi)])}_{1})$ such that $(a^{(j^{*})}_{1},a^{(j^{*}+[N^{{}^{\prime}}(1-\xi)])}_{1})\leq(a^{(j)}_{1},a^{(j+[N^{{}^{\prime}}(1-\xi)])}_{1});j=1,\dots,[N^{{}^{\prime}}\xi]$ .

Similar way, the Bayes estimates and the CRIs of $b_{1},a_{2},b_{2}$ can be obtained.

5 Property of Robustness

In previous sections we have derived the estimators of the model parameters using the robust estimation method. In this section robustness of those estimators will be studied through the Influence Function (IF) [38]. If $S(M)$ denotes the functional of any estimator from the true distribution M, then the IF is defined as follows.

IF(t;S,M)=\lim_{\epsilon\to 0}\frac{S(M_{\epsilon})-S(M)}{\epsilon}=\left.\frac{\partial S(M_{\epsilon})}{\partial\epsilon}\right|_{\epsilon=0}.

where $M_{\epsilon}=(1-\epsilon)M+\epsilon\Delta_{t}$ , $\epsilon(0<\epsilon<1)$ being the proportion and $\Delta_{t}$ being the degenerate distribution at the point t.

5.1 Influence Function of WMDPDE

Let M be the true distribution from which data is generated. If $S_{\gamma}(M)$ be the functional of WMDPDE $\hat{\bm{\Lambda}}_{\gamma}$ , then $S_{\gamma}(M)$ is the value of $\bm{\Lambda}$ which minimizes,

\displaystyle\sum_{i=1}^{I}\frac{g_{i}}{G}\left[\left\{p_{i0}^{\gamma+1}+\sum_{l=1}^{L}\sum_{r=1}^{2}p^{\gamma+1}_{ilr}\right\}-\frac{\gamma+1}{\gamma}\left\{\left(\int_{I_{i0}}dM\right)p^{\gamma}_{i0}+\sum_{l=1}^{L}\sum_{r=1}^{2}\left(\int_{I_{ilr}}dM\right)p^{\gamma}_{ilr}\right\}\right].

where, $\bm{t}=(t_{1},t_{2})$ with $t_{1},t_{2}\in R$ and $\bm{t}\in I_{il1}\implies(\tau_{i(l-1)}<t_{1}\leq\tau_{il},t_{2}>t_{1})$ ; $\bm{t}\in I_{il2}\implies(\tau_{i(l-1)}<t_{2}\leq\tau_{il},t_{1}>t_{2})$ ; $\bm{t}\in I_{i0}\implies(t_{1}>\tau_{il},t_{2}>\tau_{il})$ .

Result 4.

The influence function of $\hat{\bm{\Lambda}}_{\gamma}$ based on all I groups is given as follows.

	$\displaystyle\bm{IF(t;S_{\gamma},F_{\Lambda})}=Q_{\gamma}(\bm{\Lambda})^{-1}\sum_{i=1}^{I}\frac{g_{i}}{G}$	$\displaystyle\left[\left\{[\delta_{I_{i0}}(t)-p_{i0}]p_{i0}^{\gamma-1}\,\frac{\partial(p_{i0})}{\partial\Lambda}\right\}\right.$
		$\displaystyle\left.+\left\{\sum_{l=1}^{L}\sum_{r=1}^{2}[\delta_{I_{ilr}}(t)-p_{ilr}]p^{\gamma-1}_{ilr}\,\frac{\partial(p_{ilr})}{\partial\Lambda}\right\}\right].$

Proof.

The proof of the theorem and description of notations are given in the appendix. ∎

5.2 Influence Function of WRBE

The robustness property corresponding to robust Bayes estimators $\hat{\bm{\Lambda}}^{w}_{b\gamma}$ using IF [24] is studied through Bayes functional, which is given as follows concerning squared error loss function.

T^{(\gamma)}_{G}(M)=\frac{\int\bm{\Lambda}\exp\left\{B^{w}_{\gamma}(\bm{\Lambda};M,F_{\bm{\Lambda}})\right\}\pi(\bm{\Lambda})d\bm{\Lambda}}{\int\exp\left\{B^{w}_{\gamma}(\bm{\Lambda};M,F_{\bm{\Lambda}})\right\}\pi(\bm{\Lambda})d\bm{\Lambda}}.

Result 5.

The influence function of $\hat{\bm{\Lambda}}^{w}_{b\gamma}$ , based on all I groups, is given by the following expression.

\bm{IF(\bm{t};T_{G}^{(\gamma)},F_{\bm{\Lambda}})}=Cov_{(p)}\left(\bm{\Lambda},X_{\gamma}(\bm{\Lambda};\bm{t},f_{\bm{\Lambda}})\right).

Proof.

The proof of the theorem and description of notations are given in the appendix. ∎

6 Missing Cause of Failure Analysis

There are plenty of incidents where a device fails, but the reason for that failure cannot be identified. In such a situation, the unknown cause of failure is said to be masked or missing. In this section, we develop the previous estimation methods when cause of some of the failures are missing or unidentified.

Let us denote the number of failures of the $i^{th}$ group due to missing cause by $n_{im}$ . Then, total number of devices under observation in the $ith$ group is $g_{i}=k_{i}+n_{il1}+n_{il2}+n_{im}$ and $G=g_{1}+g_{2}+\dots+g_{I}$ . Therefore, the updated probabilities are described as follows,

	$\displaystyle P(\{\tau_{i(l-1)}<min(T_{i1},T_{i2})\leq\tau_{il}\}\cap\{missing\})=(1-p_{i0})p_{im}$
	$\displaystyle P(\{\tau_{i(l-1)}<T_{i1}\leq\tau_{il},T_{i2}>T_{i1}\}\cap\{not\;missing\})=p_{il1}(1-p_{im})$
	$\displaystyle P(\{\tau_{i(l-1)}<T_{i2}\leq\tau_{il},T_{i1}>T_{i2}\}\cap\{not\;missing\})=p_{il2}(1-p_{im})$

where $p_{im}$ denotes the probability of failure due to missing cause and $p_{i0},p_{il1},p_{il2}$ are described in Result (1). Therefore, in presence of some failures due to unidentified cause, the likelihood function becomes,

L_{m}\propto\prod_{i=1}^{I}\left[p_{i0}^{k_{i}}\big{(}(1-p_{i0})p_{im}\big{)}^{n_{im}}\prod_{l=1}^{L}\prod_{r=1}^{2}\big{(}p_{ilr}(1-p_{im})\big{)}^{n_{ilr}}\right].

The weighted density power divergence measure can be obtained as,

	$\displaystyle D_{\gamma}^{w}(\bm{\Lambda})_{m}$	$\displaystyle=\sum_{i=1}^{I}\frac{g_{i}}{G}\left[\left\{p_{i0}^{\gamma+1}+\{(1-p_{i0})p_{im})\}^{\gamma+1}+(1-p_{im})^{\gamma+1}\sum_{l=1}^{L}\sum_{r=1}^{2}p_{ilr}^{\gamma+1}\right\}\right.$
		$\displaystyle\qquad-\frac{\gamma+1}{\gamma}\left\{\hat{q}_{i0}p^{\gamma}_{i0}+\{(1-p_{i0})p_{im}\}^{\gamma}\hat{q}_{im}+(1-p_{im})^{\gamma}\sum_{l=1}^{L}\sum_{r=1}^{2}\hat{q}_{ilr}p_{ilr}^{\gamma}\right\}$
		$\displaystyle\left.\qquad+\frac{1}{\gamma}\left\{\hat{q}_{i0}^{\gamma+1}+\hat{q}^{\gamma+1}_{mi}+\sum_{l=1}^{L}\sum_{r=1}^{2}\hat{q}^{\gamma+1}_{ilr}\right\}\right].$

where $\hat{q}_{im}=\frac{n_{im}}{g_{i}}$ is the empirical failure probability for missing cause.

Therefore, the estimating equation for the WMDPDEs can be obtained as,

	$\displaystyle\sum_{i=1}^{I}g_{i}\,s_{i}\bm{\Theta}_{i}\left[B_{i0}^{(\bm{\Lambda})}C_{i}^{(p_{0})}Q_{i}^{(m)}+(1-p_{im})^{\gamma}\sum_{l=1}^{L}\sum_{r=1}^{2}B_{ilr}^{(\bm{\Lambda})}C_{i}^{(p_{r})}Q_{ilr}\right]=\mathbf{0}_{4},$
	$\displaystyle\sum_{i=1}^{I}g_{i}\left[p_{im}^{\gamma-1}\left\{(1-p_{i0})^{\gamma+1}p_{im}-p_{i0}^{\gamma}\hat{q}_{im}\right\}-(1-p_{im})^{\gamma-1}\right.$
	$\displaystyle\qquad\qquad\qquad\qquad\qquad\quad\left.\left\{\sum_{l=1}^{L}\sum_{r=1}^{2}p_{ilr}^{\gamma}\left((1-p_{im})p_{ilr}-\hat{q}_{ilr}\right)\right\}\right]=0.$

where,

	$\displaystyle Q_{i}^{(m)}$	$\displaystyle=p_{i0}^{\gamma-1}(p_{i0}-q_{i0})+p_{im}^{\gamma-1}(1-p_{i0})^{\gamma-1}\left\{p_{im}(1-p_{i0})-\hat{q}_{im}\right\},$
	$\displaystyle Q_{ilr}$	$\displaystyle=p_{ilr}^{\gamma-1}(p_{ilr}-\hat{q}_{ilr}).$

Also, the weighted robust Bayes estimator concerning the missing cause of failure can be modified to,

\hat{\bm{\Lambda}}^{w}_{(im)}=\int\bm{\Lambda}\pi^{w}_{(im)}(\bm{\Lambda}|\bm{\tau},\bm{s})\,d\bm{\Lambda}.

where,

\pi^{w}_{(im)}(\bm{\Lambda}|\bm{\tau},\bm{s})=\frac{\exp\left(B^{w}_{\gamma}(\bm{\Lambda})_{(im)}\right)\pi(\bm{\Lambda})}{\int\exp\left(B^{w}_{\gamma}(\bm{\Lambda})_{(im)}\right)\pi(\bm{\Lambda})\,d\bm{\Lambda}},

	$\displaystyle B^{w}_{\gamma}(\bm{\Lambda})_{(im)}$	$\displaystyle=\sum_{i=1}^{I}\frac{g_{i}}{G}\left[\frac{1}{\gamma}\left\{\hat{q}_{i0}p^{\gamma}_{i0}+\{(1-p_{i0})p_{im}\}^{\gamma}\hat{q}_{im}+(1-p_{im})^{\gamma}\sum_{l=1}^{L}\sum_{r=1}^{2}\hat{q}_{ilr}p_{ilr}^{\gamma}\right\}\right.$
		$\displaystyle\qquad\left.-\frac{1}{\gamma+1}\left\{p_{i0}^{\gamma+1}+\{(1-p_{i0})p_{im})\}^{\gamma+1}+(1-p_{im})^{\gamma+1}\sum_{l=1}^{L}\sum_{r=1}^{2}p_{ilr}^{\gamma+1}\right\}\right].$

The rest of the analysis of the missing causes of failure is similar to the analysis based on the methods developed in the previous sections. Hence it is omitted here.

7 Numerical Study

In this section, the performance of the theoretical results developed in previous sections has been assessed numerically through simulation experiments and real data analysis.

7.1 Simulation Analysis

For simulation purposes, 75 one-shot devices following Lindley Lifetime distribution are put to the accelerated life testing experiment under two independent competing causes of failures.

Table 2: Model layout for simulation study

Groups	Devices	Stress Levels	Inspection Times
1	20	0.1	0.5	1.0	2.0
2	25	0.2	1.0	2.0	3.0
3	30	0.3	0.5	1.0	2.0

These devices are divided into three independent groups and are subject to three stress levels. The layout for the simulation experiment is given in Table (2).

Table 3: Model parameter values for simulation experiment

$\bm{\bm{\Lambda}}$	Pure Data				Contaminated Data
$\bm{\bm{\Lambda}}$	Cause 1		Cause 2		Cause 1		Cause 2
	$\bm{a}_{1}$	$\bm{b}_{1}$	$\bm{a}_{2}$	$\bm{b}_{2}$	$\bm{a}_{1}$	$\bm{b}_{1}$	$\bm{a}_{2}$	$\bm{b}_{2}$
$\bm{\bm{\Lambda}}_{1}$	1	-3	-2	2	1.9	-3.9	-2.9	2.9
$\bm{\bm{\Lambda}}_{2}$	1.5	-3.5	-2	1.5	2.0	-4.0	-2.5	2.4

Two sets of model parameters are taken for studying the performance of MLE, WMDPDE, conventional Bayes Estimate (BE) and WRBE. Two different contamination schemes are considered to analyse the robustness of WMDPDE and WRBE over MLE and BE. These are given in Table (3).

Table 4: Bias of MLE and WMDPDE

	Pure Data				Contaminated Data
	$\hat{a}_{1}$	$\hat{b}_{1}$	$\hat{a}_{2}$	$\hat{b}_{2}$	$\tilde{a}_{1}$	$\tilde{b}_{1}$	$\tilde{a}_{2}$	$\tilde{b}_{2}$
$\bm{\Lambda}=\bm{\Lambda}_{1}$
MLE	0.1726	0.2069	-0.0996	0.4728	0.3512	1.2796	-0.1552	2.0770
WMDPDE
$\mathbf{\gamma=0.2}$	0.2579	0.1849	-0.1363	0.6768	0.2340	0.1993	-0.1462	0.7139
$\mathbf{\gamma=0.4}$	0.1830	0.1010	-0.1000	0.4058	0.1646	0.1138	-0.0679	0.5573
$\mathbf{\gamma=0.6}$	0.1568	-0.0916	-0.0886	0.1259	0.3278	-0.2082	-0.1457	0.5272
$\mathbf{\gamma=0.8}$	0.2857	-0.2785	-0.1837	0.1244	0.2114	-0.1456	-0.1130	0.2656
$\mathbf{\gamma=1.0}$	0.1465	-0.1182	-0.0851	0.1114	0.2392	-0.2147	-0.1398	0.1530
$\bm{\Lambda}=\bm{\Lambda}_{2}$
MLE	0.1328	0.1780	-0.0086	0.8102	0.5767	1.2133	-0.0249	1.0538
WMDPDE
$\mathbf{\gamma=0.2}$	0.1114	0.2941	0.0131	0.3607	0.1071	0.3833	0.0209	0.4418
$\mathbf{\gamma=0.4}$	0.1444	0.4051	0.0271	0.3207	0.1641	0.4177	0.0215	0.4162
$\mathbf{\gamma=0.6}$	0.0999	0.1478	0.0570	0.1654	0.1110	0.2828	0.0166	0.2692
$\mathbf{\gamma=0.8}$	0.1265	0.0180	-0.0213	0.2151	0.1092	0.1318	0.0100	0.3278
$\mathbf{\gamma=1.0}$	0.1322	-0.0411	-0.0279	0.1378	0.1195	-0.0422	-0.0068	0.2256

Table 5: Bias of BE and WRBE

Normal Prior
	Pure Data				Contaminated Data
	$\hat{a}_{1}$	$\hat{b}_{1}$	$\hat{a}_{2}$	$\hat{b}_{2}$	$\tilde{a}_{1}$	$\tilde{b}_{1}$	$\tilde{a}_{2}$	$\tilde{b}_{2}$
$\bm{\Lambda}=\bm{\Lambda}_{1}$
BE	-0.0102	-0.0203	-0.0082	0.0455	-0.2560	-0.4366	0.0586	0.5795
WRBE
$\mathbf{\gamma=0.2}$	-0.0137	-0.0983	-0.0808	0.1417	-0.0127	-0.0977	-0.0922	0.1514
$\mathbf{\gamma=0.4}$	-0.0158	-0.0982	-0.0894	0.1405	-0.0155	-0.1021	-0.0876	0.1432
$\mathbf{\gamma=0.6}$	-0.0140	-0.0991	-0.0870	0.1407	-0.0133	-0.1055	-0.0796	0.1376
$\mathbf{\gamma=0.8}$	-0.0161	-0.0915	-0.0913	0.1374	-0.0138	-0.1063	-0.0870	0.1380
$\mathbf{\gamma=1.0}$	-0.0037	-0.1006	-0.0865	0.1376	-0.0148	-0.1001	-0.0801	0.1489
$\bm{\Lambda}=\bm{\Lambda}_{2}$
BE	-0.1024	-0.0353	0.0155	0.0437	-0.3555	-0.2476	0.0534	0.5532
WRBE
$\mathbf{\gamma=0.2}$	-0.1032	-0.1049	-0.0886	0.1371	-0.0981	-0.0995	-0.0849	0.1443
$\mathbf{\gamma=0.4}$	-0.1002	-0.0972	-0.0821	0.1379	-0.0995	-0.1002	-0.0846	0.1407
$\mathbf{\gamma=0.6}$	-0.0979	-0.1055	-0.0897	0.1467	-0.1049	-0.1004	-0.0882	0.1508
$\mathbf{\gamma=0.8}$	-0.1024	-0.1037	-0.0896	0.1305	-0.1021	-0.1003	-0.0839	0.1484
$\mathbf{\gamma=1.0}$	-0.1055	-0.1022	-0.0790	0.1450	-0.0934	-0.0944	-0.0815	0.1455
Dirichlet Prior
$\bm{\Lambda}=\bm{\Lambda}_{1}$
BE	0.0671	-0.0341	-0.0114	0.0151	0.4903	-0.8309	-0.0806	0.4389
WRBE
$\mathbf{\gamma=0.2}$	0.0848	-0.1418	-0.0964	0.1214	0.0873	-0.1311	-0.0711	0.2039
$\mathbf{\gamma=0.4}$	0.0863	-0.1400	-0.0947	0.1351	0.0912	-0.1305	-0.0641	0.2092
$\mathbf{\gamma=0.6}$	0.0854	-0.1459	-0.0867	0.1367	0.0882	-0.1339	-0.0693	0.2092
$\mathbf{\gamma=0.8}$	0.0859	-0.1416	-0.0912	0.1333	0.0918	-0.1347	-0.0736	0.2087
$\mathbf{\gamma=1.0}$	0.0794	-0.1417	-0.0963	0.1351	0.0934	-0.1354	-0.0671	0.2025
$\bm{\Lambda}=\bm{\Lambda}_{2}$
BE	0.0519	0.0266	-0.0080	-0.0104	0.7799	0.9063	0.0412	-0.8227
WRBE
$\mathbf{\gamma=0.2}$	0.1087	-0.0912	-0.0466	0.0470	0.0908	-0.1270	-0.0441	0.0516
$\mathbf{\gamma=0.4}$	0.1096	-0.0890	-0.0473	0.0514	0.0946	-0.1373	-0.0432	0.0527
$\mathbf{\gamma=0.6}$	0.1090	-0.0920	-0.0412	0.0465	0.0969	-0.1330	-0.0493	0.0557
$\mathbf{\gamma=0.8}$	0.1183	-0.0905	-0.0462	0.0478	0.0986	-0.1351	-0.0451	0.0524
$\mathbf{\gamma=1.0}$	0.1064	-0.0806	-0.0614	0.0467	0.0962	-0.1274	-0.0408	0.0523

Robustness behaviour can be observed through the study of biases of the estimators. Hence, the Bias of MLE and WMDPDE is obtained through Monte Carlo simulation based on 1000 generations where MLE and WMDPDE are obtained using the Co-ordinate Descent Algorithm described in Algorithm (1). The threshold value c is taken as $c=0.0001$ , and the learning rate $\alpha$ is taken in the range $\alpha=(0.001,0.009)$ . The outcomes are reported in Table (4) for different schemes both in pure and contaminated cases.
For the Bayesian estimation, the MH algorithm described in Algorithm (2) is used to generate a sample from the posterior distribution. In proposal distribution, $\delta$ is set as 0.8 where we take $a=0.002$ and $\sigma^{2}=0.002.$ A sequence of 20,000 values is generated applying the MH algorithm and the first 2000 data points are discarded for burn-in, after which every $150^{th}$ sample is taken to reduce the auto-correlation. Thus 121 samples are finally obtained whose average is the posterior estimate. This process is repeated for 700 generations to approximate the biases of BE and WRBE based on both Normal and Dirichlet priors. For Dirichlet prior, we choose $\sigma^{2}_{p}=0.005$ . The outcomes are reported in Table (5).

From Table (4), it is evident that the bias of MLE is less than that of WMDPDE for pure data. When data is contaminated, WMDPDE performs better than MLE as the bias of WMDPDE is less than that of MLE. Also, the change of bias for WMDPDE is lesser than that of MLE changing from pure to contamination scheme. Table (5) shows that the bias of BE is less than that of WRBE in the pure data scheme. But after contamination, there is no significant increase in the bias of WRBE, but the bias of BE is increased. Thus, from Tables (4) and (5), it can be concluded that WMDPDE and WRBE are robust estimators. It is also observed for the simulation scheme 1 ( $\bm{\Lambda}=\bm{\Lambda}_{1}$ ), BE and WRBE based on normal prior have less bias than that of Dirichlet prior and vice versa for the simulation scheme 2 ( $\bm{\Lambda}=\bm{\Lambda}_{2}$ ). Hence the choice of prior depends on the situation at hand. Observing the figures in Tables (4) and (5), it can be concluded that among the four estimation methods; MLE, WMDPDE, BE, and WRBE; WRBE has the smallest bias compared to other estimates whereas WMDPDE and WRBE are robust estimators. Hence if prior information is available, WRBE is the best choice, but if it is not possible to get the prior knowledge, one can rely on WMDPDE for robust estimation purposes.

Table 6: Optimal value of tuning parameter

Tuning	Estimates				$\Phi_{\gamma}(\hat{\bm{\Lambda}})$
parameter	$\hat{a}_{1}$	$\hat{b}_{1}$	$\hat{a}_{2}$	$\hat{b}_{2}$	$\Phi_{\gamma}(\hat{\bm{\Lambda}})$
0.1	0.01530	-0.11309	0.01961	0.30967	0.83786
0.2	0.01150	-0.33993	0.01100	0.06346	0.74776
0.3	0.01069	-0.38879	0.01015	0.03626	0.70562
0.4	0.01444	-0.38966	0.01011	0.03579	0.68433
0.5	0.01274	-0.39045	0.01008	0.03535	0.65819
0.6	0.01157	-0.39118	0.01006	0.03495	0.62569
0.7	0.01079	-0.39184	0.01004	0.03458	0.58414
0.8	0.01027	-0.39246	0.01003	0.03424	0.52944
0.9	0.00943	-0.39302	0.01002	0.03393	0.45566
1.0	0.00973	-0.39354	0.01001	0.03364	0.35466

7.2 Optimal choice of tuning parameter

As seen in section 3, the DPD measure based estimation depends on the choice of tuning parameter $\gamma$ . Hence it becomes necessary to find the optimal value for tuning the parameter concerning the criteria of interest. This work considers a data-driven approach in finding optimal tuning parameter aiming to increase the robustness along with the precision of the estimation. Therefore, the objective function defined in (21) is to be minimised.

\Phi_{\gamma}(\bm{\Lambda})=C_{1}\,D^{w}_{\gamma}({\bm{\Lambda}})+C_{2}\,|V|

(21)

where, $D^{w}_{\gamma}$ is defined in (7), $|V|=det\left[Q_{\gamma}^{-1}({\bm{\Lambda}})R_{\gamma}({\bm{\Lambda}})Q_{\gamma}^{-1}({\bm{\Lambda}})\right]$ and $C_{1},C_{2}$ are predefined positive weight values with $C_{1}+C_{2}=1$ . The goal is to find the tuning parameter that minimises the WDPD measure between the empirical and assumed theoretical distributions and the determinant of the variance-covariance matrix of the estimate of the parameter for the given choices of tuning parameters.

Here we perform some numerical experiments in search of optimal tuning parameter under the set-up given in Table (2) with true parameter value $\bm{\Lambda}=(0.01,-0.4,0.01,0.03)^{{}^{\prime}}$ . For different values of $\gamma$ , WMDPDE are obtained and thereafter $\Phi_{\gamma}(\hat{\bm{\Lambda}})$ is calculated with $C_{1}=C_{2}=0.5.$ Those results are reported in Table (6). From this Table, it is observed that $\gamma=1.0$ is the optimal tuning parameter in this investigated case.

Table 7: Layout of the Data

Groups	Diagnosed	Median age	Inspection Times
	Patients	at diagnosis	(Months)
1	0342	45.76	1	6	12	20
2	1259	55.43	1	6	12	20
3	2179	64.58	1	6	12	20
Groups	Survived		Deaths
	Patients		(Cancer, Other)
1	066		(018,04)	(108,02)	(082,03)	(057,02)
2	247		(101,14)	(421,22)	(268,17)	(163,06)
3	392		(219,19)	(748,31)	(426,16)	(312,16)

Table 8: Estimates of parameters and Bootstrap Bias and RMSE of MLE and WMDPDE for real data

	Estimates				Bootstrap Bias(RMSE)
	$\hat{\bm{a}}_{1}$	$\hat{\bm{b}}_{1}$	$\hat{\bm{a}}_{2}$	$\hat{\bm{b}}_{2}$	$\hat{\bm{a}}_{1}$	$\hat{\bm{b}}_{1}$	$\hat{\bm{a}}_{2}$	$\hat{\bm{b}}_{2}$
MLE	0.0613	0.1604	0.4116	-0.3044	0.0939	0.0102	0.0119	-0.0863
					(0.1125)	(0.0368)	(0.01719)	(0.1253)
WMDPDE
$\mathbf{\gamma=0.2}$	0.0772	0.1680	0.4147	-0.3196	0.0889	0.0134	0.0148	-0.0994
					(0.0890)	(0.0270)	(0.0187)	(0.0947)
$\mathbf{\gamma=0.4}$	0.0613	0.1676	0.4136	-0.3784	0.0419	0.0119	0.0137	-0.0580
					(0.0890)	(0.0270)	(0.0187)	(0.0954)
$\mathbf{\gamma=0.6}$	0.0330	0.1681	0.4066	-0.3464	0.0135	0.0114	0.0066	-0.0260
					(0.0301)	(0.0244)	(0.0037)	(0.0152)
$\mathbf{\gamma=0.8}$	0.0205	0.1629	0.4124	-0.3221	0.0085	0.0057	0.0049	-0.0018
					(0.0016)	(0.0017)	(0.0011)	(0.0015)
$\mathbf{\gamma=1.0}$	0.0195	0.1620	0.4167	-0.3210	0.0016	0.0041	0.0017	-0.0076
					(0.0010)	(0.0012)	(0.0035)	(0.0016)

Table 9: Bootstrap Bias and RMSE of BE and WRBE for real data

	Estimates				Bootstrap Bias(RMSE)
	$\hat{\bm{a}}_{1}$	$\hat{\bm{b}}_{1}$	$\hat{\bm{a}}_{2}$	$\hat{\bm{b}}_{2}$	$\hat{\bm{a}}_{1}$	$\hat{\bm{b}}_{1}$	$\hat{\bm{a}}_{2}$	$\hat{\bm{b}}_{2}$
Normal Prior
BE	0.0120	0.1559	0.3988	-0.3221	-0.0415	0.0069	0.0085	0.0142
					(0.0734)	(0.0094)	(0.0102)	(0.0437)
WRBE
$\mathbf{\gamma=0.2}$	0.0237	0.1631	0.3947	-0.3151	-0.0019	0.0076	-0.0032	-0.0138
					(0.0072)	(0.0083)	(0.0077)	(0.0080)
$\mathbf{\gamma=0.4}$	0.0186	0.1547	0.3938	-0.3233	-0.0013	-0.0054	-0.0061	-0.0032
					(0.0072)	(0.0071)	(0.0097)	(0.0101)
$\mathbf{\gamma=0.6}$	0.0285	0.1657	0.3995	-0.3297	0.0083	0.0058	-0.0004	-0.0098
					(0.0061)	(0.0104)	(0.0083)	(0.0105)
$\mathbf{\gamma=0.8}$	0.0157	0.1622	0.4074	-0.3235	-0.0042	0.0018	0.0072	-0.0033
					(0.0059)	(0.0064)	(0.0112)	(0.0073)
$\mathbf{\gamma=1.0}$	0.0258	0.1518	0.4007	-0.3165	-0.0017	-0.0067	0.0004	-0.0005
					(0.0063)	(0.0091)	(0.0060)	(0.0062)
Dirichlet Prior
BE	0.0281	0.1683	0.4055	-0.3275	0.0075	0.0074	0.0088	0.0417
					(0.0204)	(0.0095)	(0.0624)	(0.0734)
WRBE
$\mathbf{\gamma=0.2}$	0.0219	0.1626	0.4011	-0.3163	0.0013	0.0027	-0.0003	0.0017
					(0.0062)	(0.0074)	(0.0079)	(0.0065)
$\mathbf{\gamma=0.4}$	0.0200	0.1483	0.3951	-0.3175	-0.0001	-0.0105	-0.0054	0.0002
					(0.0079)	(0.0096)	(0.0087)	(0.0086)
$\mathbf{\gamma=0.6}$	0.0161	0.1542	0.4012	-0.3049	-0.0038	-0.0049	0.0003	0.0130
					(0.0209)	(0.0075)	(0.0062)	(0.0070)
$\mathbf{\gamma=0.8}$	0.0225	0.1589	0.4061	-0.3317	0.0021	-0.0008	0.0053	-0.0133
					(0.0041)	(0.0062)	(0.0092)	(0.0069)
$\mathbf{\gamma=1.0}$	0.0174	0.1724	0.3990	-0.3268	-0.0019	-0.0047	-0.0008	-0.0049
					(0.0063)	(0.0077)	(0.0061)	(0.0079)

7.3 Data Analysis

For the real data analysis purposes, data is adopted from the database named “Incidence $\cdot$ SEER Research Data, 8 Registries, Nov 2021 Sub (1975-2019)”[29] which is recorded by National Cancer Institute (Surveillance, Epidemiology, and End Results Program (SEER)). The data is extracted from patients between the periods 2008-2010 who were diagnosed with Pancreas cancer. Further, data on 418 patients is omitted because of the unknown status of their death, cause of death or survival months. Death due to pancreas cancer is taken as competing cause 1, and death due to other causes is taken as competing cause 2. For this analysis, patients are divided into three age-groups $(40-49)$ , $(50-59)$ and $(60-69)$ years. Their median age at diagnosis is taken as the stress levels. Stress levels are multiplied by $0.1$ for computational ease. The patients are observed at the interval of $(1,6,12,20)$ months for each of the groups which are converted to years by dividing it by $12$ . Number of deaths are recorded for each time interval and the number of surviving patients is noted. The data layout is described in Table (7).

Table 10: 95% HPD CRI of the parameter estimates for real data

Normal Prior
	$\hat{\bm{a}}_{1}$		$\hat{\bm{b}}_{1}$		$\hat{\bm{a}}_{2}$		$\hat{\bm{b}}_{2}$
	LL	UL	LL	UL	LL	UL	LL	UL
BE	0.0072	0.0179	0.1466	0.1664	0.3953	0.4009	-0.3271	-0.3162
WRBE
$\mathbf{\gamma=0.2}$	0.0174	0.0292	0.1590	0.1667	0.3902	0.3998	-0.3241	-0.3041
$\mathbf{\gamma=0.4}$	0.0143	0.0246	0.1520	0.1571	0.3877	0.4019	-0.3262	-0.3194
$\mathbf{\gamma=0.6}$	0.0189	0.0346	0.1602	0.1718	0.3939	0.4072	-0.3344	-0.3204
$\mathbf{\gamma=0.8}$	0.0112	0.0215	0.1538	0.1711	0.3999	0.4130	-0.3269	-0.3194
$\mathbf{\gamma=1.0}$	0.0223	0.0289	0.1476	0.1561	0.3964	0.4074	-0.3217	-0.3124
Dirichlet Prior
BE	0.0228	0.0334	0.1595	0.1762	0.3985	0.4130	-0.3323	-0.3230
WRBE
$\mathbf{\gamma=0.2}$	0.0185	0.0254	0.1585	0.1668	0.3974	0.4032	-0.3205	-0.3122
$\mathbf{\gamma=0.4}$	0.0124	0.0265	0.1437	0.1521	0.3902	0.4006	-0.3216	-0.3108
$\mathbf{\gamma=0.6}$	0.0117	0.0220	0.1511	0.1588	0.3947	0.4091	-0.3126	-0.2982
$\mathbf{\gamma=0.8}$	0.0190	0.0258	0.1544	0.1631	0.4015	0.4105	-0.3460	-0.3153
$\mathbf{\gamma=1.0}$	0.0122	0.0235	0.1651	0.1796	0.3946	0.4033	-0.3335	-0.3176

To ensure that the two-parameter Lindley distribution can be fitted to the data, a bootstrap-based goodness of fit test is performed. The distance-based test statistic is given as,

T=\sum_{i=1}^{I}\left\{\left\mid\frac{k_{i}-\hat{k}_{i}}{\hat{k}_{i}}\right\mid+\sum_{l=1}^{L}\sum_{r=1}^{2}\left\mid\frac{n_{ilr}-\hat{n}_{ilr}}{\hat{n}_{ilr}}\right\mid\right\}.

where $\hat{k_{i}}$ and $\hat{n}_{ilr}$ are the expected number of survival and failures, respectively. MLE given in Table (8) is used to obtain the expected number of deaths and survivals. The value of the test statistic came out to be $6.6594$ , and the corresponding approximate p-value is $0.792$ , which strongly satisfies the assumption of Lindley distribution as the lifetime distribution.

For estimation purposes, the initial values are chosen as $a_{1}=0.02,b_{1}=0.16,a_{2}=0.40,b_{2}=-0.32$ , which are found through the grid-search procedure. The estimates based on MLE and WMDPDE with their bootstrap bias and root mean square of errors (RMSE) are reported in Table (8). From this Table, it can be observed that as the tuning parameter increases, the bias of estimates decreases. Also, the estimates based on BE and WRBE with their bootstrap bias and RMSE are given in Table (9). From this Table, it can be observed that bias and RMSE based on Dirichlet prior is less than the corresponding bias and RMSE based on Normal prior in general. From Tables (8) and (9) it can be concluded that WRBE based on Dirichlet prior is desirable among the four methods of estimation for this data set-up. Finally, The 95% HPD CRI of the parameters are reported in Table (10) where LL indicates the lower limit and UL indicates the upper limit of the HPD CRI.

7.4 Missing Cause of Failure Data Analysis

When the death of patients is recorded, but the observer misses the cause of their death, missing cause of failure analysis is applied. Here the missing cause of failure probability (MCFP) is considered constant for all the independent groups and time intervals.

Table 11: Count Data Details (missing cause of failure)

Groups	Diagnosed	Median age	Inspection Times
	Patients	at diagnosis	(Months)
1	0348	45.76	1	6	12	20
2	1276	55.41	1	6	12	20
3	2198	64.59	1	6	12	20
Groups	Survived	Deaths	Deaths
	Patients	(Missing Cause)	(Cancer, Other)
1	066	06	(018,04)	(108,02)	(082,03)	(057,02)
2	247	17	(101,14)	(421,22)	(268,17)	(163,06)
3	392	19	(219,19)	(748,31)	(426,16)	(312,16)

For the study, 42 patients are considered from the previously omitted 418 patients in the SEER dataset whose cause of death could not be recorded. The updated layout of the data is provided in Table (11). For estimation purposes, the initial values taken here are $a_{1}=0.001,b_{1}=0.400,a_{2}=0.050,b_{2}=-0.320,p_{m}=0.10$ , found through extensive grid search. The test statistic’s value for verifying the fitness of Lindley lifetime distribution is obtained as $16.6354$ , and the corresponding approximate p-value came out to be $0.267$ , which suggests that the distribution can be fitted to the given data. The estimates of the parameters are shown in Table (12). The corresponding bootstrap bias with RMSE is presented in Tables (13) and (14). By closely observing the Tables (13) and (14), it can be concluded that estimates based on the Normal prior may be chosen over the other estimates in terms of bias and RMSE.

Table 12: Estimate of parameters for real data (with Missing Cause of failure)

Normal Prior
	Estimates
	$\hat{\bm{a}}_{1}$	$\hat{\bm{b}}_{1}$	$\hat{\bm{a}}_{2}$	$\hat{\bm{b}}_{2}$	$\hat{\bm{p}}_{m}$
MLE	0.0033	0.0703	0.3886	-0.3173	0.0956
WMDPDE
$\mathbf{\gamma=0.2}$	0.0069	0.0574	0.3967	-0.3191	0.0987
$\mathbf{\gamma=0.4}$	0.0033	0.0554	0.3979	-0.3195	0.0992
$\mathbf{\gamma=0.6}$	0.0091	0.0541	0.3985	-0.3197	0.0996
$\mathbf{\gamma=0.8}$	0.0046	0.0830	0.3989	-0.3194	0.0997
$\mathbf{\gamma=1.0}$	0.0023	0.0775	0.3991	-0.3197	0.0998
BE	0.0030	0.0550	0.4269	-0.3181	0.0950
WRBE
$\mathbf{\gamma=0.2}$	0.0041	0.0482	0.4262	-0.3176	0.1017
$\mathbf{\gamma=0.4}$	0.0012	0.0526	0.3990	-0.3212	0.0973
$\mathbf{\gamma=0.6}$	0.0071	0.0585	0.3846	-0.3173	0.1003
$\mathbf{\gamma=0.8}$	0.0065	0.0497	0.3907	-0.3128	0.0993
$\mathbf{\gamma=1.0}$	0.0041	0.0482	0.4262	-0.3176	0.1017
Dirichlet Prior
BE	0.0030	0.0480	0.3592	-0.3171	0.1006
WRBE
$\mathbf{\gamma=0.2}$	0.0070	0.0501	0.4022	-0.3211	0.1024
$\mathbf{\gamma=0.4}$	0.0034	0.0438	0.3977	-0.3197	0.1094
$\mathbf{\gamma=0.6}$	0.0052	0.0526	0.3818	-0.3260	0.0989
$\mathbf{\gamma=0.8}$	0.0023	0.0467	0.3872	-0.3211	0.1006
$\mathbf{\gamma=1.0}$	0.0037	0.0481	0.3722	-0.3166	0.0957

Table 13: Bootstrap Bias and RMSE of MLE and WMDPDE for real data (with Missing Cause of failure)

	Bootstrap Bias (RMSE)
	$\hat{\bm{a}}_{1}$	$\hat{\bm{b}}_{1}$	$\hat{\bm{a}}_{2}$	$\hat{\bm{b}}_{2}$	$\hat{\bm{p}}_{m}$
MLE	0.0276	0.1398	-0.0473	0.0030	-0.0037
	(0.0282)	(0.3296)	(0.2101)	(0.0046)	(0.0064)
WMDPDE
$\mathbf{\gamma=0.2}$	0.0010	0.0072	0.0002	0.0012	-0.0009
	(0.0445)	(0.3387)	(0.2266)	(0.0039)	(0.0045)
$\mathbf{\gamma=0.4}$	0.0050	0.0065	-0.0091	0.0066	-0.0005
	(0.0220)	(0.3478)	(0.2844)	(0.0015)	(0.0010)
$\mathbf{\gamma=0.6}$	0.0011	0.0059	-0.0015	0.0011	-0.0006
	(0.0113)	(0.3484)	(0.2945)	(0.0054)	(0.0026)
$\mathbf{\gamma=0.8}$	0.0054	0.0484	-0.0013	0.0007	-0.0003
	(0.0040)	(0.3486)	(0.3015)	(0.0005)	(0.0028)
$\mathbf{\gamma=1.0}$	0.0022	0.0428	-0.0011	0.0004	-0.0002
	(0.0016)	(0.3488)	(0.3071)	(0.0004)	(0.0023)

Table 14: Bootstrap Bias and RMSE of BE and WRBE for real data (with Missing Cause of failure)

Normal Prior
	Bootstrap Bias (RMSE)
	$\hat{\bm{a}}_{1}$	$\hat{\bm{b}}_{1}$	$\hat{\bm{a}}_{2}$	$\hat{\bm{b}}_{2}$	$\hat{\bm{p}}_{m}$
BE	0.0023	0.0049	0.0259	0.0042	-0.0050
	(0.0043)	(0.0038)	(0.0023)	(0.0056)	(0.0041)
WRBE
$\mathbf{\gamma=0.2}$	0.0038	-0.0016	0.0273	0.0024	0.0017
	(0.0045)	(0.0037)	(0.0362)	(0.0050)	(0.0063)
$\mathbf{\gamma=0.4}$	0.0002	0.0024	-0.0017	-0.0011	-0.0026
	(0.0010)	(0.0057)	(0.0121)	(0.0108)	(0.0039)
$\mathbf{\gamma=0.6}$	0.0062	0.0086	-0.0160	0.0027	0.0005
	(0.0046)	(0.0041)	(0.0359)	(0.0046)	(0.0118)
$\mathbf{\gamma=0.8}$	-0.0033	-0.0003	-0.0105	0.0072	-0.0007
	(0.0041)	(0.0055)	(0.0358)	(0.0057)	(0.0019)
$\mathbf{\gamma=1.0}$	0.0032	-0.0016	0.0274	0.0024	0.0017
	(0.0074)	(0.0036)	(0.0372)	(0.0055)	(0.0037)
Dirichlet Prior
BE	0.0018	-0.0017	-0.0411	0.0028	0.0006
	(0.0041)	(0.0043)	(0.0548)	(0.0038)	(0.0039)
WRBE
$\mathbf{\gamma=0.2}$	0.0061	0.0017	0.0012	-0.0011	0.0025
	(0.0043)	(0.0038)	(0.0228)	(0.0056)	(0.0041)
$\mathbf{\gamma=0.4}$	0.0024	-0.0061	-0.0032	0.0004	0.0091
	(0.0042)	(0.0091)	(0.0036)	(0.0038)	(0.0058)
$\mathbf{\gamma=0.6}$	0.0043	0.0026	-0.0185	-0.0061	-0.0096
	(0.0056)	(0.0054)	(0.0037)	(0.0054)	(0.0061)
$\mathbf{\gamma=0.8}$	0.0010	-0.0028	-0.0126	-0.0012	0.0008
	(0.0095)	(0.0050)	(0.0509)	(0.0043)	(0.0062)
$\mathbf{\gamma=1.0}$	-0.0044	-0.0018	-0.0259	0.0034	-0.0046
	(0.0038)	(0.0052)	(0.0371)	(0.0049)	(0.0044)

8 Conclusion

This work is focused on the development of density power divergence based robust method of estimation both in classical and Bayesian framework under two independent competing causes of failures formulated by Lindley lifetime distribution in the context of one-shot device data analysis. Through extensive simulation experiments, the robustness of the weighted minimum density power divergence estimator and weighted robust Bayes estimator has been proved over the conventional maximum likelihood estimator and Bayesian estimator, respectively. It has also been found that when data is contaminated, the bias of the weighted robust Bayes estimator is the least among the four methods of estimation. But when prior information is not possible to obtain, one can rely on a weighted minimum density power divergence estimator. For both the robust estimators, we have derived the influence function which measures the robustness analytically. As the tuning parameter plays a crucial role in estimation purposes, we have proposed an approach for finding out the optimal value of it aiming to increase robustness and precision in estimation. Some numerical experiments have been done in this regard. Further, the developments of the theoretical results have been extended to the situation when the cause of some of the failures remains unidentified or missing. Finally, the SEER-Pancreas cancer data set has been taken for real-life lifetime data analysis to establish the utility of the theoretical results explained in this work.

The model analyzed here can be implemented assuming other lifetime distributions. This study can be extended to the situation of dependent competing risks. Robust testing of hypotheses in the Bayesian framework can also be developed. Efforts in this direction are in the pipeline and we are optimistic about reporting these findings soon.

Declarations of interest

none.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Data availability

Surveillance, Epidemiology, and End Results (SEER) Program (www.seer.cancer.gov) SEER*Stat Database: Incidence - SEER Research Data, 8 Registries, Nov 2021 Sub (1975-2019) - Linked To County Attributes - Time Dependent (1990-2019) Income/Rurality, 1969-2020 Counties, National Cancer Institute, DCCPS, Surveillance Research Program, released April 2022, based on the November 2021 submission.

Appendix A Proof of Result (1)

	$\displaystyle p_{il1}$	$\displaystyle=\int\limits_{\tau_{i(l-1)}}^{\tau_{il}}\int\limits_{t_{1}}^{\infty}f_{i1}(t_{1};\alpha_{i1},\theta_{i1})f_{i2}(t_{2};\alpha_{i2},\theta_{i2})\,dt_{1}dt_{2}=\frac{\theta_{i1}^{2}\theta_{i}^{-3}A^{(p_{1})}_{i}}{\prod_{r=1}^{2}(\alpha_{ir}\theta_{ir}+1)}$
	$\displaystyle p_{il2}$	$\displaystyle=\int\limits_{\tau_{i(l-1)}}^{\tau_{il}}\int\limits_{t_{2}}^{\infty}f_{i2}(t_{2};\alpha_{i2},\theta_{i2})f_{i1}(t_{1};\alpha_{i1},\theta_{i1})\,dt_{2}dt_{1}=\frac{\theta_{i2}^{2}\theta_{i}^{-3}A^{(p_{2})}_{i}}{\prod_{r=1}^{2}(\alpha_{ir}\theta_{ir}+1)}$
	$\displaystyle p_{i0}$	$\displaystyle=P(T_{il1g}>\tau_{iL},T_{il2g}>\tau_{iL})=\prod_{r=1}^{2}\left(\frac{1+\alpha_{ir}\theta_{ir}+\tau_{iL}\theta_{ir}}{\alpha_{ir}\theta_{ir}+1}\right)e^{-\tau_{iL}\theta_{ir}}$

where,

	$\displaystyle A^{(p_{1})}_{i}$	$\displaystyle=\bigg{\{}A_{i}(\alpha_{i1},\theta_{i2})E_{i}^{(0)}+A_{i(l-1)}(\theta_{i2})-A_{il}(\theta_{i2})\bigg{\}}$
	$\displaystyle E_{i}^{(\eta)}$	$\displaystyle=\tau_{i(l-1)}^{\eta}e^{-\tau_{i(l-1)}\theta_{i}}-\tau_{il}^{\eta}e^{-\tau_{il}\theta_{i}}$
	$\displaystyle A^{(p_{2})}_{i}$	$\displaystyle=\bigg{\{}A_{i}(\alpha_{i2},\theta_{i1})E_{i}^{(0)}+A_{i(l-1)}(\theta_{i1})-A_{il}(\theta_{i1})\bigg{\}}\quad;\quad\theta_{i}=\theta_{i1}+\theta_{i2}$
	$\displaystyle A_{i}(\alpha,\theta)$	$\displaystyle=\theta_{i}\{1+\theta_{i}(\alpha+\theta\,\alpha_{i1}\alpha_{i2})\}$
	$\displaystyle A_{i(\mu)}(\theta)$	$\displaystyle=e^{-\tau_{i(\mu)}\theta_{i}}[\tau_{i(\mu)}\{\theta_{i}^{2}+\theta(1+\theta_{i})(2+\alpha_{i}\theta_{i})\}+\theta_{i}^{2}\theta\tau_{i(\mu)}^{2}]$

Appendix B Proof of Result (2)

The set of estimating equations for MLEs is obtained by,

\displaystyle\sum_{i=1}^{I}\left[k_{i}\frac{\partial(ln(p_{i0}))}{\partial\bm{\bm{\Lambda}}}+\sum_{l=1}^{L}\sum_{r=1}^{2}n_{ilr}\frac{\partial(ln(p_{ilr}))}{\partial\bm{\bm{\Lambda}}}\right]=\mathbf{0_{4}}

Here,

	$\displaystyle\frac{\partial(ln(p_{i0}))}{\partial\bm{\bm{\Lambda}}}$	$\displaystyle=\frac{1}{p_{i0}}\,s_{i}\bm{\Theta}_{i}B_{i0}^{(\bm{\bm{\Lambda}})}C_{i}^{(p_{0})}\quad\text{and}\quad\frac{\partial(ln(p_{ilr}))}{\partial\bm{\bm{\Lambda}}}=\frac{1}{p_{ilr}}s_{i}\bm{\Theta}_{i}B_{ilr}^{(\bm{\bm{\Lambda}})}C_{i}^{(p_{r})}$
	$\displaystyle\text{where}\quad\bm{\Theta}_{i}$	$\displaystyle=(\alpha_{i1},\theta_{i1},\alpha_{i2},\theta_{i2})^{{}^{\prime}}\;,\;\bm{\bm{\Lambda}}=(a_{1},b_{1},a_{2},b_{2})^{{}^{\prime}}\;,\;\alpha_{i}=\alpha_{i1}+\alpha_{i2}$

By utilizing the notations from proof of Result (1),

	$\displaystyle C_{i}^{(p_{0})}$	$\displaystyle=\frac{e^{-\tau_{iL}\theta_{i}}}{\prod_{r=1}^{2}(\alpha_{ir}\theta_{ir}+1)}\;,\quad C_{i}^{(p_{r})}=\frac{\theta_{ir}^{2}\theta_{i}^{-3}}{\prod_{r=1}^{2}(\alpha_{ir}\theta_{ir}+1)}\;;\quad B_{i0}^{(a_{1})}=B_{i}^{(1)}\tau_{iL}\theta_{i1}^{2}$
	$\displaystyle B_{i0}^{(a_{2})}$	$\displaystyle=B_{i}^{(2)}\tau_{iL}\theta_{i2}^{2}\;;\;B_{i}^{(1)}=\frac{1+\alpha_{i2}\theta_{i2}+\tau_{iL}\theta_{i2}}{\alpha_{i1}\theta_{i1}+1}\;,\;B_{i}^{(2)}=\frac{1+\alpha_{i1}\theta_{i1}+\tau_{iL}\theta_{i1}}{\alpha_{i2}\theta_{i2}+1}$
	$\displaystyle B_{i0}^{(b_{1})}$	$\displaystyle=B_{i}^{(1)}[B(\alpha_{i1},\theta_{i1})]\quad;\quad B_{i0}^{(b_{2})}=B_{i}^{(2)}[B(\alpha_{i2},\theta_{i2})]$
	$\displaystyle B(\alpha,\theta)$	$\displaystyle=\left[\{(\alpha\theta+1)(\alpha+\tau_{iL})\}-\left\{(1+\alpha\theta+\theta\tau_{iL})(\alpha\theta+\alpha+1)\right\}\right]$

	$\displaystyle B_{il1}^{(a_{1})}$	$\displaystyle=\theta_{i}[G_{i}^{(1)}(\theta_{i2},0)+G_{i}^{(2)}(\theta_{i2},1)]-A_{i}^{(p_{1})}(\alpha_{i1}\theta_{i1}+1)^{-1}$
	$\displaystyle B_{il2}^{(a_{2})}$	$\displaystyle=\theta_{i}[G_{i}^{(1)}(\theta_{i1},0)+G_{i}^{(2)}(\theta_{i1},1)]-A_{i}^{(p_{2})}(\alpha_{i2}\theta_{i2}+1)^{-1}$
	$\displaystyle B_{il1}^{(a_{2})}$	$\displaystyle=\theta_{i}[G_{i}^{(3)}(\alpha_{i1},\theta_{i2},0)+G_{i}^{(4)}(\theta_{i2},1)]-A_{i}^{(p_{1})}(\alpha_{i2}\theta_{i2}+1)^{-1}$
	$\displaystyle B_{il2}^{(a_{1})}$	$\displaystyle=\theta_{i}[G_{i}^{(3)}(\alpha_{i2},\theta_{i1},0)+G_{i}^{(4)}(\theta_{i1},1)]-A_{i}^{(p_{2})}(\alpha_{i1}\theta_{i1}+1)^{-1}$
	$\displaystyle B_{il1}^{(b_{1})}$	$\displaystyle=G_{i}^{(5)}(\alpha_{i1},\theta_{i1},0)+G_{i}^{(6)}(\alpha_{i1},\theta_{i2},1)-G_{i}^{(7)}(\theta_{i2},2)$
		$\displaystyle\qquad-G_{i}^{(8)}(\theta_{i2},3)+A^{(p_{1})}_{i}G_{i}^{(9)}(\alpha_{i1},\theta_{i1})$
	$\displaystyle B_{il2}^{(b_{2})}$	$\displaystyle=G_{i}^{(5)}(\alpha_{i2},\theta_{i2},0)+G_{i}^{(6)}(\alpha_{i2},\theta_{i1},1)-G_{i}^{(7)}(\theta_{i1},2)$
		$\displaystyle\qquad-G_{i}^{(8)}(\theta_{i1},3)+A^{(p_{2})}_{i}G_{i}^{(9)}(\alpha_{i2},\theta_{i2})$
	$\displaystyle B_{il1}^{(b_{2})}$	$\displaystyle=G_{i}^{(10)}(\alpha_{i2},\theta_{i2},0)+G_{i}^{(11)}(\alpha_{i1},\theta_{i2},1)-G_{i}^{(12)}(\theta_{i2},2)$
		$\displaystyle\qquad-G_{i}^{(6)}(\alpha_{i2},\theta_{i2},3)+A^{(p_{1})}_{i}G_{i}^{(13)}(\alpha_{i2},\theta_{i2})$
	$\displaystyle B_{il2}^{(b_{1})}$	$\displaystyle=G_{i}^{(10)}(\alpha_{i1},\theta_{i1},0)+G_{i}^{(11)}(\alpha_{i2},\theta_{i1},1)-G_{i}^{(12)}(\theta_{i1},2)$
		$\displaystyle\qquad-G_{i}^{(6)}(\alpha_{i1},\theta_{i1},3)+A^{(p_{2})}_{i}G_{i}^{(13)}(\alpha_{i1},\theta_{i1})$

	$\displaystyle G^{(1)}_{i}(\theta,\eta)$	$\displaystyle=\theta_{i}(1+\alpha_{i}\theta)E^{(\eta)}_{i}\;;\;G^{(2)}_{i}(\theta,\eta)=\theta(1+\theta_{i})E^{(\eta)}_{i}$
	$\displaystyle G_{i}^{(3)}(\alpha,\theta,\eta)$	$\displaystyle=\alpha\theta\theta_{i}E^{(\eta)}_{i}\;;\;G^{(4)}_{i}(\theta,\eta)=\theta(1+\theta_{i})E^{(\eta)}_{i}$
	$\displaystyle G_{i}^{(5)}(\alpha,\theta,\eta)$	$\displaystyle=\{1+2\theta_{i}(\alpha+\theta\alpha_{i1}\alpha_{i2})\}E^{(\eta)}_{i}$
	$\displaystyle G_{i}^{(6)}(\alpha,\theta,\eta)$	$\displaystyle=\{2\theta_{i}+\theta(2+\alpha_{i}+2\alpha_{i}\theta_{i})-A_{i}(\alpha,\theta)\}E^{(\eta)}_{i}$
	$\displaystyle G_{i}^{(7)}(\theta,\eta)$	$\displaystyle=\{\theta^{2}_{i}+\theta(2+\alpha_{i}\theta_{i}(1+\theta_{i}))\}E^{(\eta)}_{i}\;;\;G_{i}^{(8)}(\theta,\eta)=\theta\theta^{2}_{i}E^{(\eta)}_{i}$
	$\displaystyle G_{i}^{(9)}(\alpha,\theta)$	$\displaystyle=\left\{\frac{2\theta^{-1}+\alpha-3\theta^{-1}(\alpha\theta+1)}{\alpha\theta+1}\right\}$
	$\displaystyle G_{i}^{(10)}(\alpha,\theta,\eta)$	$\displaystyle=\{1+\theta_{i}\alpha_{i}(2+\alpha(\theta_{i}+2\theta))\}E^{(\eta)}_{i}$
	$\displaystyle G_{i}^{(11)}(\alpha,\theta,\eta)$	$\displaystyle=\{2\theta_{i}+\theta(2+\alpha_{i}+2\alpha_{i}\theta_{i})+(1+\theta_{i})(2+\alpha_{i}\theta_{i})-A_{i}(\alpha,\theta)\}E^{(\eta)}_{i}$
	$\displaystyle G_{i}^{(12)}(\theta,\eta)$	$\displaystyle=\theta\{2+\alpha_{i}\theta_{i}(1+\theta_{i})\}E^{(\eta)}_{i}\;;\;G_{i}^{(13)}(\alpha,\theta)=\frac{3\theta^{-1}_{i}(\alpha\theta+1)+\alpha}{\alpha\theta+1}$

Therefore, the following equations are obtained whose solution is the MLEs.

\sum_{i=1}^{I}s_{i}\bm{\Theta}_{i}\left[\frac{k_{i}}{p_{i0}}B_{i0}^{(\bm{\bm{\Lambda}})}C_{i}^{(p_{0})}+\sum_{l=1}^{L}\sum_{r=1}^{2}\frac{n_{ilr}}{p_{ilr}}B_{ilr}^{(\bm{\bm{\Lambda}})}C_{i}^{(p_{r})}\right]=\mathbf{0_{4}}

Appendix C Proof of Theorem (1)

To obtain the alternative expression for the WDPD measure, we denote failure probabilities and survival probability for $i^{th}$ group as,

$p_{1i1}=P_{i1}$
$p_{2i1}=P_{i2}$
$p_{1i2}=P_{i3}$
$p_{2i2}=P_{i4}$
$\dots\;\dots$
$p_{il1}=P_{i(M-2)}$
$p_{il2}=P_{i(M-1)}$
$p_{i0}=P_{iM}$

We have, $h=2(l-1)+r$ and $M=2L+1$ i.e. $p_{ilr}=P_{i[2(l-1)+r]}$ . For the $i^{th}$ group, the number of failures between the interval $(\tau_{i(l-1)}-\tau_{il})$ is denoted as, $N_{il}=n_{il1}+n_{il2}$ . Again, we define, $X_{ui}=(X_{ui1},X_{ui2},\dots,X_{uiM})\sim MN(1,\undertilde{P_{i}})$ ; where, $\undertilde{P_{i}}=(P_{i1},P_{i2},\dots,P_{iM})$ . Therefore, $N_{ih}=\sum_{u_{i}=1}^{g_{i}}X_{u_{i}h}$ . Hence, WDPD measure in equation (7) can be rewritten as,

\displaystyle H^{w}_{\gamma}(\bm{\Lambda})

\displaystyle=\sum_{i=1}^{I}\frac{g_{i}}{G}\left[\sum_{h=1}^{M}(P_{ih})^{\gamma+1}-\frac{\gamma+1}{\gamma}\sum_{h=1}^{M}\frac{N_{ih}}{g_{i}}(P_{ih})^{\gamma}+\frac{1}{\gamma}\sum_{h=1}^{M}\left(\frac{N_{ih}}{g_{i}}\right)^{\gamma+1}\right].

(22)

Based on Calvino et al.[31], the proof of the theorem proceeds as follows.
From equation (22) WDPD measure ignoring the terms independent of parameters is given as,

\displaystyle H_{g}(\gamma)

\displaystyle=\sum_{i=1}^{I}\frac{g_{i}}{G}\left[\sum_{h=1}^{M}(P_{ih})^{\gamma+1}-\frac{\gamma+1}{\gamma}\sum_{h=1}^{M}\frac{N_{ih}}{g_{i}}(P_{ih})^{\gamma}\right]=\sum_{i=1}^{I}\frac{g_{i}}{G}\left[\frac{1}{g_{i}}\sum_{u_{i}=1}^{g_{i}}V_{u_{i}\gamma}(\bm{\Lambda})\right].

where, $V_{u_{i}\gamma}(\bm{\Lambda})=\left\{\sum_{h=1}^{M}P_{ih}^{\gamma+1}-\frac{\gamma+1}{\gamma}\sum_{h=1}^{M}X_{u_{i}h}P_{ih}^{\gamma}\right\}.$

\displaystyle\text{Let,}\;H_{g\bm{\Lambda}}

\displaystyle=\frac{\partial(H_{g}(\gamma))}{\partial\bm{\Lambda}}=\sum_{i=1}^{I}\frac{g_{i}}{G}\left[\frac{1}{g_{i}}\sum_{u_{i}=1}^{g_{i}}\frac{\partial(V_{u_{i}\gamma}(\bm{\Lambda}))}{\partial\bm{\Lambda}}\right]

For the $i^{th}$ group,

\displaystyle\frac{\partial(V_{u_{i}\gamma}(\bm{\Lambda}))}{\partial\bm{\Lambda}}

\displaystyle=\sum_{h=1}^{M}(\gamma+1)P_{ih}^{\gamma}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}}-(\gamma+1)\sum_{h=1}^{M}X_{u_{i}h}P_{ih}^{\gamma-1}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}}

Here, we get,

\displaystyle E\left(\frac{\partial(V_{u_{i}\gamma}(\bm{\Lambda}))}{\partial\bm{\Lambda}}\right)

\displaystyle=0\;,\;Var\left(\frac{\partial(V_{u_{i}\gamma}(\bm{\Lambda}))}{\partial\bm{\Lambda}}\right)=Var(Y)\;;\;Y=\frac{\partial(V_{u_{i}\gamma}(\bm{\Lambda}))}{\partial\bm{\Lambda}}

	$\displaystyle Var(Y)$	$\displaystyle=(\gamma+1)^{2}Var\left(\sum_{h=1}^{M}X_{u_{i}h}P_{ih}^{\gamma-1}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}}\right)$
		$\displaystyle=(\gamma+1)^{2}\left[\sum_{h=1}^{M}P_{ih}^{2(\gamma-1)}\left(\frac{\partial(P_{ih})}{\partial\bm{\Lambda}}\right)^{2}P_{ih}(1-P_{ih})\right.$
		$\displaystyle\qquad\qquad\quad\left.-2\sum_{(h_{1},h_{2})}P_{ih_{1}}^{\gamma}P_{ih_{2}}^{\gamma}\right.\left.\frac{\partial(P_{ih_{1}})}{\partial\bm{\Lambda}}\frac{\partial(P_{ih_{2}})}{\partial\bm{\Lambda}}\right]$

(Since for multinomial distribution,
$Var(X_{u_{i}h})=P_{ih}(1-P_{ih})$ and $Cov(X_{u_{i}h_{1}},X_{u_{i}h_{2}})=-P_{ih_{1}}P_{ih_{2}}$ )

	$\displaystyle Cov(Y_{j},Y_{f})$	$\displaystyle=(\gamma+1)^{2}\left[\sum_{h=1}^{M}P_{ih}^{2(\gamma-1)}\left(\frac{\partial(P_{ih_{1}})}{\partial\bm{\Lambda}_{j}}\right)\left(\frac{\partial(P_{ih_{2}})}{\partial\bm{\Lambda}_{f}}\right)P_{ih}(1-P_{ih})\right.$
		$\displaystyle\qquad\qquad\quad\left.-2\sum_{(h_{1},h_{2})}P_{ih_{1}}^{\gamma}P_{ih_{2}}^{\gamma}\left(\frac{\partial(P_{ih_{1}})}{\partial\bm{\Lambda}_{j}}\right)\left(\frac{\partial(P_{ih_{2}})}{\partial\bm{\Lambda}_{f}}\right)\right]$

Therefore, $Cov\left(\frac{\partial(V_{u_{i}\gamma})}{\partial\bm{\Lambda}}(\bm{\Lambda})\right)=(\gamma+1)^{2}R_{i\gamma}(\bm{\Lambda})\;;$ where variance term is
$R_{i\gamma}(\bm{\Lambda})_{jj}=\frac{Var(Y)}{(\gamma+1)^{2}}$ and covariance term is $R_{i\gamma}(\bm{\Lambda})_{jf}=\frac{Cov(Y_{j},Y_{f})}{(\gamma+1)^{2}}$ . Hence,

	$\displaystyle Y=\frac{\partial(V_{u_{i}\gamma}(\bm{\Lambda}))}{\partial\bm{\Lambda}}\sim N\left(\mathbf{0},(\gamma+1)^{2}R_{i\gamma}(\bm{\Lambda})\right)$
	$\displaystyle\implies\frac{1}{g_{i}}\sum_{u_{i}=1}^{g_{i}}\frac{\partial(V_{u_{i}\gamma}(\bm{\Lambda}))}{\partial\bm{\Lambda}}\sim N\left(\bm{0},\frac{(\gamma+1)^{2}}{g_{i}}R_{i\gamma}(\bm{\Lambda})\right)$
	$\displaystyle H_{g\bm{\Lambda}}=\frac{\partial(H_{g}(\gamma))}{\partial\bm{\Lambda}}\sim N\left(\bm{0},\frac{(\gamma+1)^{2}}{G^{2}}\sum_{i=1}^{I}g_{i}R_{i\gamma}(\bm{\Lambda})\right).$

Now, we define, $T_{\gamma}=-\sqrt{G}H_{g\bm{\Lambda}}=-\sqrt{G}\frac{\partial(H_{g}(\gamma))}{\partial\bm{\Lambda}}$

\displaystyle\implies T_{\gamma}\sim N\left(\bm{0},(\gamma+1)^{2}\sum_{i=1}^{I}\frac{g_{i}}{G}R_{i\gamma}(\bm{\Lambda})\right)

(23)

	$\displaystyle\text{For,}\;\frac{\partial^{2}V_{u_{i}\gamma}(\bm{\Lambda})}{\partial\bm{\Lambda}_{f}\partial\bm{\Lambda}_{j}}$	$\displaystyle=\left[\sum_{h=1}^{M}\left\{(\gamma+1)\gamma P_{ih}^{\gamma-1}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{j}}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{f}}+(\gamma+1)P_{ih}^{\gamma}\,\frac{\partial^{2}(P_{ih})}{\partial\bm{\Lambda}_{j}\partial\bm{\Lambda}_{f}}\right\}\right]$
		$\displaystyle\qquad-\left[(\gamma+1)\sum_{h=1}^{M}X_{u_{i}h}(\gamma-1)P_{ih}^{\gamma-2}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{j}}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{f}}\right]$
		$\displaystyle\qquad-\left[(\gamma+1)\sum_{h=1}^{M}X_{u_{i}h}P_{ih}^{\gamma-1}\frac{\partial^{2}(P_{ih})}{\partial\bm{\Lambda}_{j}\partial\bm{\Lambda}_{f}}\right]$

Since, $\frac{1}{k_{i}}\sum_{u_{i}=1}^{g_{i}}X_{u_{i}h}\xrightarrow{p}P_{ih}$ .
Therefore, $\frac{1}{g_{i}}\sum_{u_{i}=1}^{g_{i}}\frac{\partial^{2}V_{u_{i}\gamma}(\bm{\Lambda})}{\partial\bm{\Lambda}_{f}\partial\bm{\Lambda}_{j}}\xrightarrow{p}(\gamma+1)\sum_{h=1}^{M}P_{ih}^{\gamma-1}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{j}}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{f}}$ .

Hence,

\displaystyle\frac{\partial(H_{g\bm{\Lambda}})}{\partial\bm{\Lambda}_{f}}=\frac{\partial^{2}(H_{g}(\gamma))}{\partial\bm{\Lambda}_{f}\partial\bm{\Lambda}_{j}}=\xrightarrow{p}\sum_{i=1}^{I}\frac{g_{i}}{G}(\gamma+1)\sum_{h=1}^{M}P_{ih}^{\gamma-1}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{j}}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{f}}

Consider $\bm{\Lambda}^{0}$ to be the true value of parameters, then using the Taylor series expansion ignoring the higher order terms,

	$\displaystyle H_{g\bm{\Lambda}}=H_{g\bm{\Lambda}^{0}}+\sum_{f=1}^{4}\left.\frac{\partial(H_{g\bm{\Lambda}})}{\partial\bm{\Lambda}_{f}}\right\|_{\bm{\Lambda=\Lambda_{0}}}(\hat{\bm{\Lambda}}_{f}-\hat{\bm{\Lambda}}_{f}^{0})+\frac{1}{2}\sum_{j=1}^{4}\sum_{f=1}^{4}\left.\frac{\partial^{2}(H_{g\bm{\Lambda}})}{\partial\bm{\Lambda}_{j}\partial\bm{\Lambda}_{f}}\right\|_{\bm{\Lambda=\Lambda_{0}}}(\hat{\bm{\Lambda}}_{j}-\hat{\bm{\Lambda}}^{0}_{j})(\hat{\bm{\Lambda}}_{f}-\hat{\bm{\Lambda}}^{0}_{f})$
	$\displaystyle\text{Since,}\quad H_{g\hat{\bm{\Lambda}}_{\gamma}}=\bm{0}\quad\text{and therefore,}$
	$\displaystyle-\sqrt{G}H_{g\bm{\Lambda}^{0}}=\sqrt{G}\sum_{f=1}^{4}(\hat{\bm{\Lambda}}_{f}-\hat{\bm{\Lambda}}^{0}_{f})\left[\left.\frac{\partial(H_{g\bm{\Lambda}})}{\partial\bm{\Lambda}_{f}}\right\|_{\bm{\Lambda=\Lambda_{0}}}+\frac{1}{2}\sum_{j=1}^{4}\sum_{f=1}^{4}\left.\frac{\partial^{2}(H_{g\bm{\Lambda}})}{\partial\bm{\Lambda}_{j}\partial\bm{\Lambda}_{f}}\right\|_{\bm{\Lambda=\Lambda_{0}}}(\hat{\bm{\Lambda}}_{j}-\hat{\bm{\Lambda}}^{0}_{j})\right]$		(24)

Then,

	$\displaystyle A_{(j,f)}$	$\displaystyle\xrightarrow{p}\sum_{i=1}^{I}\frac{g_{i}}{G}(\gamma+1)\sum_{h=1}^{M}P_{ih}^{\gamma-1}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{j}}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{f}}\;,\;\text{where}\,\;\frac{g_{i}}{G}\;\text{is finite.}$
	$\displaystyle\implies A_{\gamma}$	$\displaystyle\xrightarrow{p}(\gamma+1)Q_{\gamma}(\bm{\Lambda}^{0})\,\;\text{when}\;g_{i}\to\infty.$		(25)

where, $A_{\gamma}$ is the $4\times 4$ matrix with (j,f)th elements as $A_{(j,f)}$ where $A_{(j,f)}=\left.\frac{\partial(H_{g\bm{\Lambda}})}{\partial\bm{\Lambda}_{f}}\right|_{\bm{\Lambda=\Lambda_{0}}}+\frac{1}{2}\sum_{j=1}^{4}\sum_{f=1}^{4}\left.\frac{\partial^{2}(H_{g\bm{\Lambda}})}{\partial\bm{\Lambda}_{j}\partial\bm{\Lambda}_{f}}\right|_{\bm{\Lambda=\Lambda_{0}}}(\hat{\bm{\Lambda}}_{j}-\hat{\bm{\Lambda}}^{0}_{j})$ and
$Q_{\gamma}(\bm{\Lambda}^{0})=\left[\left(\sum_{i=1}^{I}\frac{g_{i}}{G}\sum_{h=1}^{M}P_{ih}^{\gamma-1}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{j}}\frac{\partial(P_{ih})}{\partial\bm{\Lambda}_{f}}\right)_{j,f}\right]$ . Let, $Z_{f}=\sqrt{G}(\hat{\bm{\Lambda}}_{f}-\bm{\Lambda}^{0}_{f})$ . Then by equation (25) in equation (24),

	$\displaystyle-\sqrt{G}H_{g\bm{\Lambda}^{0}}=Z_{f}A_{(j,f)}\implies T_{\gamma}=Z_{\gamma}A_{\gamma}\quad(\text{from equation}\;\eqref{29})\implies Z_{\gamma}=A_{\gamma}^{-1}T_{\gamma}$
	$\displaystyle\sqrt{G}(\hat{\bm{\Lambda}}_{\gamma}-\bm{\Lambda}^{0})=A_{\gamma}^{-1}T_{\gamma}\sim N\left(\bm{0},Q_{\gamma}(\bm{\Lambda}^{0})^{-1}R_{\gamma}(\bm{\Lambda}^{0})Q_{\gamma}(\bm{\Lambda}^{0})^{-1}\right)\;;\;R_{\gamma}(\bm{\Lambda}^{0})=\sum_{i=1}^{I}\frac{g_{i}}{G}R_{i\gamma}(\bm{\Lambda}).$

Appendix D Proof of Result (4)

The functional $S_{\gamma}(M)$ satisfies,

	$\displaystyle\sum_{i=1}^{I}\frac{g_{i}}{G}$	$\displaystyle\left[\left\{p^{\gamma}_{i0}\frac{\partial(p_{i0})}{\partial\bm{\Lambda}}+\sum_{l=1}^{L}\sum_{r=1}^{2}p^{\gamma}_{ilr}\frac{\partial(p_{ilr})}{\partial\bm{\Lambda}}\right\}\right.$
		$\displaystyle\left.-\left\{\left(\int_{I_{i0}}dM\right)p^{\gamma-1}_{i0}\frac{\partial(p_{i0})}{\partial\bm{\Lambda}}+\sum_{l=1}^{L}\sum_{r=1}^{2}\left(\int_{I_{ilr}}dM\right)p^{\gamma-1}_{ilr}\frac{\partial(p_{ilr})}{\partial\bm{\Lambda}}\right\}\right]=0$

Replacing $M$ by $M=(1-\epsilon)F_{\bm{\Lambda}}+(\epsilon)\delta_{I_{A}}(\bm{t})$ and differentiating with respect to $\epsilon$ at $\epsilon\to 0^{+}$ we get the desired result.
where, $\delta_{I_{A}}(\bm{t})=\begin{cases}1\quad\text{if}\ \bm{t}\in I_{A}\\ 0\quad\text{otherwise}.\\ \end{cases}$

Appendix E Proof of Result (5)

\displaystyle IF(t;T^{(\gamma)}_{G},F_{\bm{\Lambda}})

\displaystyle=\left.\frac{\partial}{\partial\epsilon}T^{(\gamma)}_{G}(M_{\epsilon})\right|_{\epsilon\to 0^{+}}=Cov_{(p)}\left(\bm{\Lambda},X_{\gamma}(\bm{\Lambda};\bm{t},f_{\bm{\Lambda}})\right)

where, $Cov_{(p)}()$ is the covariance for posterior distribution and

	$\displaystyle X_{\gamma}(\bm{\Lambda};t,f_{\bm{\Lambda}})$	$\displaystyle=\left.\frac{\partial}{\partial\epsilon}\{B^{w}_{\gamma}(\bm{\Lambda};M_{\epsilon},F_{\bm{\Lambda}})\}\right\|_{\epsilon\to 0^{+}}$
		$\displaystyle=\frac{1}{\gamma}\sum_{i=1}^{I}\frac{g_{i}}{G}\left[\bigg{\{}\big{(}\delta_{I_{i0}}(t)-p_{i0}\big{)}p_{i0}^{\gamma}\bigg{\}}+\left\{\sum_{l=1}^{L}\sum_{r=1}^{2}\big{(}\delta_{I_{ilr}}(t)-p_{ilr}\big{)}p^{\gamma}_{ilr}\right\}\right].$

References

[1] T.-H. Fan, N. Balakrishnan, C.-C. Chang, The bayesian approach for highly reliable electro-explosive devices using one-shot device testing, Journal of Statistical Computation and Simulation 79 (9) (2009) 1143–1154. doi:https://doi.org/10.1080/00949650802142592.
[2] N. Balakrishnan, M. H. Ling, Em algorithm for one-shot device testing under the exponential distribution, Computational Statistics & Data Analysis 56 (3) (2012) 502–509. doi:https://doi.org/10.1016/j.csda.2011.09.010.
[3] N. Balakrishnan, M. H. Ling, Multiple-stress model for one-shot device testing data under exponential distribution, IEEE Transactions on Reliability 61 (3) (2012) 809–821. doi:10.1109/TR.2012.2208301.
[4] N. Balakrishnan, M. H. Ling, Gamma lifetimes and one-shot device testing analysis, Reliability Engineering & System Safety 126 (2014) 54–64. doi:https://doi.org/10.1016/j.ress.2014.01.009.
[5] M. J. Crowder, Classical competing risks, CRC Press, 2001. doi:https://doi.org/10.1201/9781420035902.
[6] C. Zhang, Y. Shi, M. Wu, Statistical inference for competing risks model in step-stress partially accelerated life tests with progressively type-i hybrid censored weibull life data, Journal of computational and applied mathematics 297 (2016) 65–74. doi:https://doi.org/10.1016/j.cam.2015.11.002.
[7] S. Dutta, H. K. T. Ng, S. Kayal, Inference for a general family of inverted exponentiated distributions under unified hybrid censoring with partially observed competing risks data, Journal of Computational and Applied Mathematics 422 (2023) 114934. doi:https://doi.org/10.1016/j.cam.2022.114934.
[8] L. Wang, Inference of progressively censored competing risks data from kumaraswamy distributions, Journal of Computational and Applied Mathematics 343 (2018) 719–736. doi:https://doi.org/10.1016/j.cam.2018.05.013.
[9] X. Bai, Y. Shi, H. K. T. Ng, Y. Liu, Inference of accelerated dependent competing risks model for marshall–olkin bivariate weibull distribution with nonconstant parameters, Journal of Computational and Applied Mathematics 366 (2020) 112398. doi:https://doi.org/10.1016/j.cam.2019.112398.
[10] N. Balakrishnan, H. Y. So, M. H. Ling, Em algorithm for one-shot device testing with competing risks under exponential distribution, Reliability Engineering & System Safety 137 (2015) 129–140. doi:https://doi.org/10.1016/j.ress.2014.12.014.
[11] N. Balakrishnan, H. Y. So, M. H. Ling, Em algorithm for one-shot device testing with competing risks under weibull distribution, IEEE Transactions on Reliability 65 (2) (2015) 973–991. doi:10.1109/TR.2015.2500361.
[12] N. Balakrishnan, H. Y. So, M. H. Ling, A bayesian approach for one-shot device testing with exponential lifetimes under competing risks, IEEE Transactions on Reliability 65 (1) (2015) 469–485. doi:10.1109/TR.2015.2440235.
[13] D. V. Lindley, Fiducial distributions and bayes’ theorem, Journal of the Royal Statistical Society. Series B (Methodological) (1958) 102–107doi:https://www.jstor.org/stable/2983909.
[14] R. Plackett, Introduction to probability and statistics from a bayesian viewpoint (1966). doi:https://doi.org/10.2307/3614870.
[15] M. E. Ghitany, B. Atieh, S. Nadarajah, Lindley distribution and its application, Mathematics and computers in simulation 78 (4) (2008) 493–506. doi:https://doi.org/10.1016/j.matcom.2007.06.007.
[16] J. Mazucheli, J. A. Achcar, The lindley distribution applied to competing risks lifetime data, Computer methods and programs in biomedicine 104 (2) (2011) 188–192. doi:https://doi.org/10.1016/j.cmpb.2011.03.006.
[17] A. Basu, I. R. Harris, N. L. Hjort, M. Jones, Robust and efficient estimation by minimising a density power divergence, Biometrika 85 (3) (1998) 549–559. doi:https://doi.org/10.1016/j.cmpb.2011.03.006.
[18] N. Balakrishnan, E. Castilla, N. Martín, L. Pardo, Robust estimators for one-shot device testing data under gamma lifetime model with an application to a tumor toxicological data, Metrika 82 (8) (2019) 991–1019. doi:https://doi.org/10.1007/s00184-019-00718-5.
[19] N. Balakrishnan, E. Castilla, N. Martín, L. Pardo, Robust inference for one-shot device testing data under weibull lifetime model, IEEE transactions on Reliability 69 (3) (2019) 937–953. doi:10.1109/TR.2019.2954385.
[20] N. Balakrishnan, E. Castilla, N. Martín, L. Pardo, Robust inference for one-shot device testing data under exponential lifetime model with multiple stresses, Quality and Reliability Engineering International 36 (6) (2020) 1916–1930. doi:https://doi.org/10.1002/qre.2665.
[21] N. Balakrishnan, E. Castilla, N. Martin, L. Pardo, Power divergence approach for one-shot device testing under competing risks, Journal of Computational and Applied Mathematics 419 (2023) 114676. doi:https://doi.org/10.1016/j.cam.2022.114676.
[22] N. Balakrishnan, E. Castilla, Robust inference for destructive one-shot device test data under weibull lifetimes and competing risks, Journal of Computational and Applied Mathematics 437 (2024) 115452. doi:https://doi.org/10.1016/j.cam.2023.115452.
[23] G. Hooker, A. N. Vidyashankar, Bayesian model robustness via disparities, Test 23 (2014) 556–584. doi:https://doi.org/10.1007/s11749-014-0360-z.
[24] A. Ghosh, A. Basu, Robust bayes estimation using the density power divergence, Annals of the Institute of Statistical Mathematics 68 (2016) 413–437. doi:https://doi.org/10.1007/s10463-014-0499-0.
[25] J. Warwick, M. Jones, Choosing a robustness tuning parameter, Journal of Statistical Computation and Simulation 75 (7) (2005) 581–588. doi:https://doi.org/10.1080/00949650412331299120.
[26] A. Ghosh, A. Basu, Robust estimation for non-homogeneous data and the selection of the optimal tuning parameter: the density power divergence approach, Journal of Applied Statistics 42 (9) (2015) 2056–2072. doi:https://doi.org/10.1080/02664763.2015.1016901.
[27] S. Basak, A. Basu, M. Jones, On the ‘optimal’density power divergence tuning parameter, Journal of Applied Statistics 48 (3) (2021) 536–556. doi:https://doi.org/10.1080/02664763.2020.1736524.
[28] E. Castilla, P. J. Chocano, On the choice of the optimal tuning parameter in robust one-shot device testing analysis, Trends in Mathematical, Information and Data Sciences: A Tribute to Leandro Pardo (2022) 169–180doi:https://doi.org/10.1007/978-3-031-04137-2-16.
[29] Surveillance, epidemiology, and end results (seer) program (www.seer.cancer.gov) seer*stat database: Incidence - seer research data, 8 registries, nov 2021 sub (1975-2019) - linked to county attributes - time dependent (1990-2019) income/rurality, 1969-2020 counties, national cancer institute, dccps, surveillance research program, released april 2022, based on the november 2021 submission.
[30] W. Lu, Y. Liang, Analysis of competing risks data with missing cause of failure under additive hazards model, Statistica Sinica (2008) 219–234doi:https://www.jstor.org/stable/24308254.
[31] A. Calvino, N. Martin, L. Pardo, Robustness of minimum density power divergence estimators and wald-type test statistics in loglinear models with multinomial sampling, Journal of Computational and Applied Mathematics 386 (2021) 113214. doi:https://doi.org/10.1016/j.cam.2020.113214.
[32] L. Greco, W. Racugno, L. Ventura, Robust likelihood functions in bayesian inference, Journal of Statistical Planning and Inference 138 (5) (2008) 1258–1270. doi:https://doi.org/10.1016/j.jspi.2007.05.001.
[33] B.-E. Chérief-Abdellatif, P. Alquier, Mmd-bayes: Robust bayesian estimation via maximum mean discrepancy, in: Symposium on Advances in Approximate Bayesian Inference, PMLR, 2020, pp. 1–21. doi:https://proceedings.mlr.press/v118/cherief-abdellatif20a.html.
[34] J. Jewson, S. Jim Q., H. Chris, Principles of bayesian inference using general divergence criteria., Entropy 20 (6) (2018) 442. doi:https://doi.org/10.3390/e20060442.
[35] T. Nakagawa, H. Shintaro, Robust bayesian inference via $\gamma$ -divergence, Communications in Statistics-Theory and Methods 49 (2) (2020) 343–360. doi:https://doi.org/10.1080/03610926.2018.1543765.
[36] H. L. Lee, C. Morris A., A multinomial logit model for the spatial distribution of hospital utilization, Journal of Business & Economic Statistics 3 (2) (1985) 159–168. doi:10.1080/07350015.1985.10509445.
[37] W. K. Hastings, Monte carlo sampling methods using markov chains and their applications, Biometrika 57 (1) (1970) 97–109. doi:https://doi.org/10.1093/biomet/57.1.97.
[38] F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, W. A. Stahel, Robust statistics: the approach based on influence functions, John Wiley & Sons, 2011. doi:10.1002/9781118186435.