This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Bayes Factor of Zero Inflated Models under Jeffereys Prior

Paramahansa Pramanik 111e-mail: [email protected] 222Department of Mathematics and Statistics, University of South Alabama, 411 North University Boulevard, Mobile, AL 36688, USA.     Arnab Kumar Maity 333email:[email protected]444Pfizer Inc., 10777 Science Center Drive, San Diego, CA 92121, USA.

Abstract.

Microbiome omics data including 16S rRNA reveal intriguing dynamic associations between the human microbiome and various disease states. Drastic changes in microbiota can be associated with factors like diet, hormonal cycles, diseases, and medical interventions. Along with the identification of specific bacteria taxa associated with diseases, recent advancements give evidence that metabolism, genetics, and environmental factors can model these microbial effects. However, the current analytic methods for integrating microbiome data are fully developed to address the main challenges of longitudinal metagenomics data, such as high-dimensionality, intra-sample dependence, and zero-inflation of observed counts. Hence, we propose the Bayes factor approach for model selection based on negative binomial, Poisson, zero-inflated negative binomial, and zero-inflated Poisson models with non-informative Jeffreys prior. We find that both in simulation studies and real data analysis, our Bayes factor remarkably outperform traditional Akaike information criterion and Vuong’s test. A new R package BFZINBZIP has been introduced to do simulation study and real data analysis to facilitate Bayesian model selection based on the Bayes factor.

Key words:

Negative binomial distribution; Zero inflated negative binomial distribution, Poisson distribution, zero inflated poisson distribution; Bayes factor; non-informative Jeffreys prior; Microbiome.

1 Introduction

The human microbiome consistes of the collection of estimated 3.0×10133.0\times 10^{13} (Sender et al.,, 2016) bacteria and 3.3×1063.3\times 10^{6} genes (Qin et al.,, 2010; Jiang et al.,, 2021). The human microbiome analysis impose a drastic impact on human health and disease (Ursell et al.,, 2012). In recent years, microbiome studies have been successfully identified disease-associated bacteria taxa in type 2 diabetes (Karlsson et al.,, 2013), liver cirrhosis (Qin et al.,, 2014), inflammatory bowl disease (Halfvarson et al.,, 2017; Kakkat et al.,, 2023), and melanoma patients responsive to cancer immunotherapy (Frankel et al.,, 2017; Jiang et al.,, 2021; Dasgupta et al.,, 2023; Hertweck et al.,, 2023; Khan et al.,, 2023; Vikramdeo et al.,, 2023). Quantification of of the human microbiome usually being proceeded by 16s rRNA sequencing or metagenomic shotgun sequencing, where sequence read counts are often summarized into a taxa count table (Jiang et al.,, 2023). Bioinformatics tools like quantitative insights into microbial ecology (QIIME) and mothur are used for analyzing raw 16S rRNA sequencing data (Jovel et al.,, 2016; Zhang and Yi,, 2020). In this literature the word taxa means operational taxonomic units or other taxonomic or functional groups of bacterial sequences (Jiang et al.,, 2023; Altaweel et al.,, 2022). Although innovations in sequencing technology continue to prosper in microbiome studies, the statistical methods used in this field fail to catch up with these advanced sequences (Jiang et al.,, 2021). For example, metagenomic shotgun sequencing generates an increasingly large amount of sequence reads which give species or confine level taxonomic resolution (Segata et al.,, 2012). The subsequent statistical analysis compares whether a species is linked to a phenotypic state or experimental condition (Jiang et al.,, 2021; Polansky and Pramanik,, 2021; Maity et al., 2018b, ).

One commonly used statistical approach in microbiome community involves comparing multiple taxa (Chen et al.,, 2012; Kelly et al.,, 2015; Wu et al.,, 2016; Jiang et al.,, 2021). These approaches do not identify differentially abundant species, which makes clinical interpretation, mechanistic insights, and biological validations difficult (Jiang et al.,, 2021; Pramanik and Polansky, 2023a, ). An alternative approach is to interrogate each individual bacterial taxa for different groups or conditions. La Rosa et al., (2015) use Wilcoxon rank sum or Kruskal-Wallis tests for groupwise comparisons on microbiome compositional data. In more recent years, RNA-seq methods have been adapted to microbiome studies, such as the negative-binomial regression model in DESeq2 (Love et al.,, 2014) and overdispersed Poisson model in edgeR (Robinson et al.,, 2010; Maity et al., 2018a, ). However, these approaches are not optimized for microbiome datasets (Jiang et al.,, 2021). Microbial abundance is influenced by covariates like metabolites, antibiotics, and host genetics (Pramanik, 2022a, ; Pramanik, 2022b, ; Pramanik, 2023c, ). To account for these confounding variables, the association between microbiome and clinical confounders must be quantified. Pairwise correlations between all taxa and covariates are commonly used, but this method may be underpowered (Kinross et al.,, 2011; Maier et al.,, 2018; Zhu et al.,, 2018). Other approaches have been proposed to detect covariate-taxa associations, but these ignore the taxon-outcome associations (Pramanik, 2023a, ; Pramanik, 2023b, ). Recently, Li et al., (2018); Schweizer et al., (2022) developed a multivariate zero-inflated logistic normal model to quantify the associations between microbiome abundances and multiple factors based on microbiome compositional data instead of the count data (Jiang et al.,, 2021; Maity et al.,, 2020; Altaweel et al.,, 2019).

Recent studies have investigated the relationship between diseases and the human microbiome over time. For example, Vatanen et al., (2016) followed 222 infants from birth to age 3 to study their gut microbiome development and its associations with the increasing incidence of autoimmune diseases. Additionally, Romero et al., (2014) compared the vaginal microbiota of pregnant women who delivered preterm versus those who delivered at term. Longitudinal metagenomic count data is often overdispersed (Pramanik,, 2016; Hua et al.,, 2019), and sparse (Pramanik, 2021b, ). Two main categories exist for handling these data sets: the first involves logarithmic or other transformations on count data, followed by usage of linear mixed models to analyze the transformed data (Benson et al.,, 2010; La Rosa et al.,, 2014; Leamy et al.,, 2014; Wang et al.,, 2015). The second category involves zero-inflated Gaussian mixed models to address sparsity issues in longitudinal metagenomics data (Zhang and Yi,, 2020). The first method performs poorly under certain conditions and fails to address the sparcity issue (O’Hara and Kotze,, 2010). On the other hand, zero-inflated Gaussian mixed models successfully address the sparsity issue and can be used to analyze both transformed and untransformed metagenomic data (Zhang and Yi,, 2020; Pramanik,, 2020; Pramanik, 2021c, ; Pramanik and Polansky,, 2021). The second category is generalized linear mixed models, which enable direct analysis of longitudinal metagenomic count data. Metagenomic count data can typically be analyzed similarly to RNA-Seq data, assuming a negative binomial distribution (Pramanik and Polansky,, 2020; Zhang and Yi,, 2020; Pramanik, 2021a, ; Pramanik and Polansky, 2023b, ).

Here, we propose a Bayesian integrative approach of computing Bayes factor to analyze microbiome count data (Polansky and Pramanik,, 2021). Our approach jointly identifies differential abundant taxa among two groups of women (i.e., pregnant and non-pregnant). The data includes 16S rRNA gene sequence based vaginal microbiota from which samples are collected from each subject over intervals of weeks, resulting in 143 taxa and N = 900 longitudinal samples (139 measurements from pregnant women and 761 measurements from non-pregnant women.) To do our experiment we have used Romero et al., (2014) and Jiang et al., (2023) data sets. Count data with large number of zeros (i.e. zero-inflation) are encountered in different fields such as medicine (Böhning et al.,, 1999), public health (Zhou and Tu,, 2000; Maity et al., 2021a, ; Maity et al.,, 2020), environmental sciences (Agarwal et al.,, 2002), agriculture (Hall,, 2000), manufacturing studies(Lambert,, 1992), Orange-crowned Warblers in ponderosa (Garay et al.,, 2011; Maity and Paul,, 2022; White and Bennetts,, 1996). Zero-inflation, a common exemplification of overdispersion, refers to the incidence of zero counts is relatively higher than usual (Garay et al.,, 2011). Since, zero counts frequently have special status in statistical literature, this definitely leads us do research in this area. For example, a production engineer might count the number of defective items selected at random from a production process (Bayarri et al.,, 2008). If overdispersion in raw data is caused by zero-inflation, then zero-inflated Poisson (ZIP) model provides a standard framework for fitting the data (Garay et al.,, 2011; Lambert,, 1992; Maity and Basu,, 2023). According to Ghosh and Samanta, (2002) when some production processes are in absolute states, zero defects occure more frequently (Bayarri et al.,, 2008). An approach to address this issue is to use a two-parameter distribution so that the extra parameter permits a larger variance (Bhattacharya et al.,, 2008). Double exponential family approach, a two-parametric modification of a standard one-parameter exponential family, has been developed which allows more variability than permitted by the single-parameter version (Bhattacharya et al.,, 2008; Efron,, 1986). This is reasonable in count data distributions, such as Poisson, but not useful to model data inflated with zeros (Bhattacharya et al.,, 2008). Fundamental idea of ZIP model is to mix a distribution degenerate at zero with a Poisson distribution (Garay et al.,, 2011). In other words, ZIP assumes that a population consists of two individual types whereas the first type gives a zero count and the second type gives a Poisson-distributed count (Ridout et al.,, 2001; Maity et al., 2019b, ; Maity,, 2016).

If a data set with zero-inflation exhibits overdispersion, a zero-inflated negative binomial (ZINB) model, mixture of a distribution degenerate at zero with a baseline negative binomial distribution, over the ZIP model (Garay et al.,, 2011). Since overdispersion is a ramification of excess zeros, the result has excess variability and ZIP model might not a good fit for such data (Garay et al.,, 2011). A multivariate random-parameter ZINB regression model for modeling crash counts has been developed in Dong et al., (2014). A score test for conducting hypothesis testing of ZIP regression models versus ZINB has been performed inRidout et al., (2001); Paul et al., (2018). A ZINB framework with a Gaussian process has been introduced by Li et al., (2021) to analyze spatial transcriptomics data in which analysis was conducted under a Bayesian framework (Nam et al.,, 2022; Calvo et al.,, 2023). Jiang et al. (Jiang et al.,, 2021; Maity et al., 2019a, ) have been used a ZINB regression model to perform an integrative analysis on microbiome data (Nam et al.,, 2022). In Nam et al., (2022) a statistical inference has been discussed for a zero-inflated binomial distribution with an objective Bayesian and frequentist approaches to determine a point and an interval estimators of the model parameters. Furthermore, a hypothesis testing for excessive zeros in a zero-inflated binomial distribution have been performed and finally, a Monte Carlo simulation is utilized to investigate the performance of estimation and hypothesis testing procedures (Nam et al.,, 2022).

Since the baseline Poisson fails to incorporate the remaining overdispersion not accounted for through zero-inflation and negative binomial models are more flexible than their Poisson counterparts in dealing with overdispersion, ZIP model is not a good fit for such data (Garay et al.,, 2011; Lawless,, 1987; Maity and Dey,, 2018; Maity and Paul,, 2023; Maity et al., 2021c, ; Maity et al., 2021b, ). Moreover, it is a well known fact that the ZIP parameter estimators can be significantly biased under overdispersion of non-zero counts in relation to Poisson distribution (Garay et al.,, 2011). Although, there is a large interest in testing of the presence of overdispersion on a given dataset, our main concentration in this paper is on those circumstances where the data exhibits overdispersion. Furthermore, in this paper we discuss Bayesian methodologies when a negative binomial (NB), ZINB, Poisson or ZIP is fitted to the dataset. We investigate the effectiveness of our theoretical results through simulation and real data analysis based on Romero et al., (2014); Ghosh et al., (2023) and Jiang et al., (2023) data sets. We have introduced a new R package BFZINBZIP to facilitate model selection from Poisson, NB, ZIP, and ZINB distributions.

A popular method to determine the estimates of parameters is to maximize the likelihood or natural logarithm of the likelihood with various Bayesian approaches (Maity and Paul,, 2022; Sommerhalder et al.,, 2023). For example, a Poisson scale representation of NB with Gamma distribution as the mixing density has been discussed in Burrell, (1990); Roy Sarkar et al., (2019); Beck et al., (2023); Maity, (2022), a polynomial expansion and a power series expansion have been considered in Bradlow et al., (2002) and Bhattacharya et al., (2008) respectively. However a little has been given on the Bayesian analysis regarding ZIP versus ZINB models. In particular, to the best of our knowledge, no work exists on posterior analysis under non-informative prior analysis with above two models. We further compare our data driven results within Poisson versus NB, Poisson versus ZIP, NB versus ZINB and ZIP versus ZINB models.

Let 𝐘=[Y1,Y2,,Yn]\mathbf{Y}=[Y_{1},Y_{2},...,Y_{n}]^{\prime} be a vector of observed count data such that each of the elements are independent and identically distributed, where represents transposition of the vector. If 𝐘\mathbf{Y} follows an NB distribution then for all positive γ\gamma and κ\kappa, the probability density function (pdf) is defined as

fNB(y|κ)=Γ(y+γ)y!Γ(γ)(1+κγ)γ(1+γκ)y,y=0,1,2,,f^{NB}(y|\kappa)=\frac{\Gamma{(y+\gamma)}}{y!\ \Gamma{(\gamma)}}\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\left(1+\frac{\gamma}{\kappa}\right)^{-y},\ \ y=0,1,2,...,

or, if 𝐘\mathbf{Y} follows a ZINB then for all α[0,1]\alpha\in[0,1] the pdf is

fZINB(y|α,κ)={α+(1α)(1+κγ)γ,if y=0,(1α)Γ(y+γ)y!Γ(γ)(1+κγ)γ(1+γκ)y,if y=1,2,,f^{ZINB}(y|\alpha,\kappa)=\begin{cases}\alpha+(1-\alpha)\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma},&\text{if $y=0$,}\\ (1-\alpha)\frac{\Gamma{(y+\gamma)}}{y!\ \Gamma{(\gamma)}}\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\left(1+\frac{\gamma}{\kappa}\right)^{-y},&\text{if $y=1,2,...$},\end{cases}

with mean 𝔼(𝐘)=(1α)κ\mathbb{E}(\mathbf{Y})=(1-\alpha)\kappa and variance V(𝐘)=(1α)κ(1ακ+κ/γ)V(\mathbf{Y})=(1-\alpha)\kappa(1-\alpha\kappa+\kappa/\gamma), where α\alpha is represented as the zero-inflation parameter. On the other hand, if 𝐘\mathbf{Y} follows a ZIP distribution then for θ>0\theta>0 and a zero-inflation parameter α^[0,1]\hat{\alpha}\in[0,1] the pdf is

fZIP(y|α^,θ)=α^𝟙(y=0)+(1α^)fP(y|θ),y=0,1,2,,f^{ZIP}(y|\hat{\alpha},\theta)=\hat{\alpha}\mathbbm{1}(y=0)+(1-\hat{\alpha})f^{P}(y|\theta),\ \ y=0,1,2,...,

where 𝟙(.)\mathbbm{1}(.) be an indicator function and fP(y|θ)f^{P}(y|\theta) is the Poisson density so that

fP(y|θ)=exp(θ)θyy!,y=0,1,2,.f^{P}(y|\theta)=\frac{\exp(-\theta)\ \theta^{y}}{y!},\ \ y=0,1,2,....

Many researches have been performed using ZIP distributions with and without covariates to model count data (Bayarri et al.,, 2008). For instance, Lambert, (1992) and (Ghosh and Samanta,, 2002) have been used frequentist and Bayesian approaches respectively to explore industrial data sets through a ZIP regression model (Bayarri et al.,, 2008). A Bayesian score test has been developed in Bhattacharya et al., (2008) to test the null hypothesis 0:α^0\mathcal{H}_{0}:\hat{\alpha}\leq 0 against the alternative hypothesis 1:α^>0\mathcal{H}_{1}:\hat{\alpha}>0 (Bayarri et al.,, 2008). As frequentist approach of score test has been explained in (Deng and Paul,, 2000, 2005; Van den Broek,, 1995), α^\hat{\alpha} is permitted to be negative in Bhattacharya et al., (2008), as long as α^+(1α^)exp(θ)0\hat{\alpha}+(1-\hat{\alpha})\exp(-\theta)\geq 0.

In a Bayesian framework we are interested in testing

𝔐0P:YiiidfP(yi|θ),\displaystyle\mathfrak{M}_{0}^{P}:Y_{i}\overset{\mathrm{iid}}{\sim}f^{P}(y_{i}|\theta),\ \ versus𝔐1NB:YiiidfNB(yi|κ),i=1,,n,\displaystyle\text{versus}\ \ \mathfrak{M}_{1}^{NB}:Y_{i}\overset{\mathrm{iid}}{\sim}f^{NB}(y_{i}|\kappa),\ i=1,...,n, (1)
𝔐0P:YiiidfP(yi|θ),\displaystyle\mathfrak{M}_{0}^{P}:Y_{i}\overset{\mathrm{iid}}{\sim}f^{P}(y_{i}|\theta),\ \ versus𝔐1ZIP:YiiidfZIP(yi|α^,θ),i=1,,n,\displaystyle\text{versus}\ \ \mathfrak{M}_{1}^{ZIP}:Y_{i}\overset{\mathrm{iid}}{\sim}f^{ZIP}(y_{i}|\hat{\alpha},\theta),\ i=1,...,n, (2)
𝔐0NB:YiiidfNB(yi|κ),\displaystyle\mathfrak{M}_{0}^{NB}:Y_{i}\overset{\mathrm{iid}}{\sim}f^{NB}(y_{i}|\kappa),\ \ versus𝔐1ZINBYiiidfZINB(yi|α,κ),i=1,,n,\displaystyle\text{versus}\ \ \mathfrak{M}_{1}^{ZINB}Y_{i}\overset{\mathrm{iid}}{\sim}f^{ZINB}(y_{i}|\alpha,\kappa),\ i=1,...,n, (3)
𝔐0ZIP:YiiidfZIP(yi|α^,θ),\displaystyle\mathfrak{M}_{0}^{ZIP}:Y_{i}\overset{\mathrm{iid}}{\sim}f^{ZIP}(y_{i}|\hat{\alpha},\theta),\ \ versus𝔐1ZINBYiiidfZINB(yi|α,κ),i=1,,n,\displaystyle\text{versus}\ \ \mathfrak{M}_{1}^{ZINB}Y_{i}\overset{\mathrm{iid}}{\sim}f^{ZINB}(y_{i}|\alpha,\kappa),\ i=1,...,n, (4)

where fP,fNB,fZIPf^{P},f^{NB},f^{ZIP} and fZINBf^{ZINB} are the densities of Poisson, NB, ZIP and ZINB distributions, respectively, and 𝔐k[.]:Yi\mathfrak{M}_{k}^{[.]}:Y_{i} has density f[.](yi|Θk),Θk={α,α^,κ,θ}f^{[.]}(y_{i}|\Theta_{k}),\Theta_{k}=\{\alpha,\hat{\alpha},\kappa,\theta\} with Θk\Theta_{k} being the parameters in model 𝔐k[.]\mathfrak{M}_{k}^{[.]} for all k=0,1k=0,1 and [.][.] represents any of Poisson, NB, ZIP and ZINB distributions based on the testing of hypotheses in 1-4.

The article is structured as follows: we first discuss convergence properties of the posterior distribution in Section 2. Next, we determine objective priors for the four distributions previously mentioned. Finally, we compute Bayes factors for hypotheses 1-4 and evaluate model performance on simulated data in Section 3. Two real-data analyses are presented in Section 4, and our conclusions are in Section 5.

2 Framework

The Bayesian methodology for choosing between 𝔐0[.]\mathfrak{M}_{0}^{[.]} and 𝔐1[.]\mathfrak{M}_{1}^{[.]} is determined by assessing the prior probabilities of each model, the prior distributions for the model parameters, and then by computing the posterior probabilities of each 𝔐k[.]\mathfrak{M}_{k}^{[.]} for all k=0,1k=0,1 (Bayarri et al.,, 2008). The posterior probabilities are calculated from the prior distributions and the Bayes Factor, a ratio of maximum likelihood for 𝔐0[.]\mathfrak{M}_{0}^{[.]} and 𝔐1[.]\mathfrak{M}_{1}^{[.]} which is standard method in Bayesian testing and model selection and is associated with Schwarz Bayesian information criterion (BIC) (Bayarri et al.,, 2008; Maity and Paul,, 2022). Most of the times, due to scarcity of resources or lack of time, it is impossible to assess all the priors diligently in a subjective manner (Berger,, 2006). In this environment, objective Bayesian approach based upon non-external information (other than constructing the problem) gives a competent answer (Bayarri et al.,, 2008; Berger,, 2006).

2.1 An overview of the Bayes factor in model selection

Let there be K+1K+1 models 𝔐0[.],.,𝔐K[.]\mathfrak{M}_{0}^{[.]},....,\mathfrak{M}_{K}^{[.]} so that k=0,1,,Kk=0,1,...,K and K>0K>0, and these models contend with each other in determining the most relevant model. If model 𝔐k[.]\mathfrak{M}_{k}^{[.]} holds, then for a parametric space 𝚯k\bm{\Theta}_{k} of Θk\Theta_{k} such that Θk𝚯k\Theta_{k}\subseteq\bm{\Theta}_{k}, 𝒫Θk\mathcal{P}_{\Theta_{k}} is a probability measure on a measurable space (𝐘,𝒜)(\mathbf{Y},\mathcal{A}) such that for each A𝒜A\in\mathcal{A}, Θk𝒫Θk(A)\Theta_{k}\mapsto\mathcal{P}_{\Theta_{k}}(A) is Borel measurable (Ghosh and Ramamoorthi,, 2010), YiY_{i} follows a parametric distribution with pdf f[.](yi|Θk),Θk={α,α^,γ,κ,θ}f^{[.]}(y_{i}|\Theta_{k}),\Theta_{k}=\{\alpha,\hat{\alpha},\gamma,\kappa,\theta\}. It is convenient that Y1,Y2,Y_{1},Y_{2},... as the coordinate random variable defined on the sample space Ω=(𝐘,𝒜)\Omega=\left(\mathbf{Y}^{\infty},\mathcal{A}^{\infty}\right) and 𝒫Θk\mathcal{P}_{\Theta_{k}}^{\infty} as the i.i.d. product measure defined on Ω\Omega (Ghosh and Ramamoorthi,, 2010). Define the space Ωn:=(𝐘n,𝒜n)\Omega_{n}:=\left(\mathbf{Y}^{n},\mathcal{A}^{n}\right) and 𝒫Θkn\mathcal{P}_{\Theta_{k}}^{n} be the nn-fold product of 𝒫Θk\mathcal{P}_{\Theta_{k}}. Bayesian model selection proceeds by choosing a prior density π(Θk|𝔐k[.])\pi\left(\Theta_{k}|\mathfrak{M}_{k}^{[.]}\right) under model 𝔐k[.]\mathfrak{M}_{k}^{[.]} for a set of parameters Θk\Theta_{k}, and the prior model probability π~(𝔐k[.])\tilde{\pi}\left(\mathfrak{M}_{k}^{[.]}\right) of 𝔐k[.]\mathfrak{M}_{k}^{[.]} before data 𝐲\mathbf{y} are observed so that 𝐲=[y1,,yn]\mathbf{y}=[y_{1},...,y_{n}]^{\prime}. Therefore, the marginal or predictive likelihood corresponding to model 𝔐k[.]\mathfrak{M}_{k}^{[.]} is defined as

m(𝐲|𝔐k[.]):=Θkf[.](𝐲|Θk,𝔐k[.])π(Θk|𝔐k[.])𝑑Θk,m\left(\mathbf{y}\big{|}\mathfrak{M}_{k}^{[.]}\right):=\int_{\Theta_{k}}f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right)\pi\left(\Theta_{k}|\mathfrak{M}_{k}^{[.]}\right)d\Theta_{k},

where f[.](𝐲|Θk,𝔐k[.])f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right) is the likelihood function under model 𝔐k[.]\mathfrak{M}_{k}^{[.]}. Therefore, the posterior probability under the assumption that model 𝔐k[.]\mathfrak{M}_{k}^{[.]} is true can written by the following expression

p(𝔐k[.]|𝐲)=m(𝐲|𝔐k[.])π~(𝔐k[.])k¯=0Km(𝐲|𝔐k¯[.])π~(𝔐k¯[.])m(𝐲|𝔐k[.])π~(𝔐k[.]).p\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}\right)=\frac{m\left(\mathbf{y}\big{|}\mathfrak{M}_{k}^{[.]}\right)\tilde{\pi}\left(\mathfrak{M}_{k}^{[.]}\right)}{\sum_{\bar{k}=0}^{K}m\left(\mathbf{y}\big{|}\mathfrak{M}_{\bar{k}}^{[.]}\right)\tilde{\pi}\left(\mathfrak{M}_{\bar{k}}^{[.]}\right)}\propto m\left(\mathbf{y}\big{|}\mathfrak{M}_{k}^{[.]}\right)\tilde{\pi}\left(\mathfrak{M}_{k}^{[.]}\right).
Definition 1.

For all n, let p(𝔐k[.]|𝐲)p\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}\right) be a posterior probability for given values y1,y2,,yny_{1},y_{2},...,y_{n}. The sequence
{p(𝔐k[.]|𝐲)}i=1n\left\{p\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}\right)\right\}_{i=1}^{n} is said to be consistent at Θ¯k\bar{\Theta}_{k} if there exists a ΩΩ\Omega^{*}\subset\Omega with 𝒫Θ¯k=1\mathcal{P}_{\bar{\Theta}_{k}}^{\infty}=1 so that if ωΩ\omega\in\Omega^{*}, then for every neighborhood 𝒩\mathcal{N} of Θ¯k\bar{\Theta}_{k},

p𝒩(𝔐k[.]|𝐲(ω))1.p_{\mathcal{N}}\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}(\omega)\right)\rightarrow 1.
Remark 1.

When the metric space 𝚯k:={Θk:ρ~(Θk,Θ¯k)<1/n:n1}\bm{\Theta}_{k}:=\left\{\Theta_{k}:\tilde{\rho}\left(\Theta_{k},\bar{\Theta}_{k}\right)<1/n:n\geq 1\right\} constructs a base for the neighborhood of Θ¯k\bar{\Theta}_{k}, and therefore it can be allowed to bet set of measure 1 to depend upon 𝒩\mathcal{N} (Ghosh and Ramamoorthi,, 2010). Hence, it is enough to show that for each 𝒩\mathcal{N} of Θ¯k\bar{\Theta}_{k}, p𝒩(𝔐k[.]|𝐲(ω))1p_{\mathcal{N}}\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}(\omega)\right)\rightarrow 1 almost everywhere of 𝒫Θ¯k\mathcal{P}_{\bar{\Theta}_{k}}^{\infty}.

Lemma 1.

Let {𝒫Θk:Θk𝚯k}\left\{\mathcal{P}_{\Theta_{k}}:\Theta_{k}\in\bm{\Theta}_{k}\right\} be a probability measure dominated by a σ\sigma-finite measure μ\mu and f[.](yi|Θk)f^{[.]}(y_{i}\big{|}\Theta_{k}) be the density of 𝒫Θk\mathcal{P}_{\Theta_{k}}. Assume Θ¯k\bar{\Theta}_{k} be an interior point of 𝚯k\bm{\Theta}_{k} (i.e. Θ¯k=𝚯ko\bar{\Theta}_{k}=\bm{\Theta}_{k}^{o}) and π(Θ0|𝔐0[.])\pi\left(\Theta_{0}\big{|}\mathfrak{M}_{0}^{[.]}\right), π(Θ1|𝔐1[.])\pi\left(\Theta_{1}\big{|}\mathfrak{M}_{1}^{[.]}\right) be two continuous prior densities w.r.t. a measure ζ\zeta. If posterior densities p(𝔐k[.]|𝐲)p\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}\right), k=0,1k=0,1 are both consistent at Θ¯k\bar{\Theta}_{k} then

limn𝚯k|p(𝔐0[.]|𝐲)p(𝔐1[.]|𝐲)|dζ(Θk)=0,almost sure𝒫Θ¯k.\lim_{n\rightarrow\infty}\int_{\bm{\Theta}_{k}}\left|p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}\right)-p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}\right)\right|d\zeta(\Theta_{k})=0,\ \text{almost sure}\ \mathcal{P}_{\bar{\Theta}_{k}}.
Proof.

See the Appendix. ∎

Lemma 2.

Let p(𝔐0[.]|𝐲)p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}\right) and p(𝔐1[.]|𝐲)p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}\right) be posterior probabilities for give values of y1,y2,,yny_{1},y_{2},...,y_{n} such that for some constants Ψk\Psi_{k}, φk\varphi_{k}, k=0,1,k=0,1,

1p(𝔐k[.]|𝐲)=Ψk𝐲φkexp(𝐲2/2)(1+o(1)),as 𝐲,1-p\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}\right)=\Psi_{k}\ \mathbf{y}^{\varphi_{k}}\exp(-\mathbf{y}^{2}/2)(1+o(1)),\ \text{as $\mathbf{y}\rightarrow\infty$,}

Then their convolution can be written as

1p(𝔐0[.]|𝐲)p(𝔐1[.]|𝐲)=πΨ0Ψ12(φ1+φ2)𝐲φ1+φ2+1exp(𝐲2/4)(1+o(1)),1-p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}\right)*p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}\right)=\sqrt{\pi}\Psi_{0}\Psi_{1}2^{-(\varphi_{1}+\varphi_{2})}\mathbf{y}^{\varphi_{1}+\varphi_{2}+1}\exp(-\mathbf{y}^{2}/4)(1+o(1)),

as 𝐲\mathbf{y}\rightarrow\infty.

Proof.

See the Appendix. ∎

Along the way of Ghosh and Ramamoorthi, (2010) we will discuss Bernstein-von Mises theorem on the asymptotic normality of the posterior distribution p(𝔐k[.]|𝐲)p\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}\right). If a consistent global likelihood estimator Θkn\Theta_{k}^{n} exists, then under differentiability it is easy to verify that for all 𝒫Θk\mathcal{P}_{\Theta_{k}}, it is a consistent solution of the likelihood equation a.s. 𝒫Θk\mathcal{P}_{\Theta_{k}} (Ghosh and Ramamoorthi,, 2010). To show the consistency of the posterior distribution we need the following assumption 1 of the density function f[.](yi|Θk)f^{[.]}\left(y_{i}\big{|}\Theta_{k}\right).

Assumption 1.

(i). For model 𝔐k[.]\mathfrak{M}_{k}^{[.]}, {yi:f[.](yi|Θk)>0,i=1,,n}\left\{y_{i}:f^{[.]}\left(y_{i}\big{|}\Theta_{k}\right)>0,\ i=1,...,n\right\} takes the same value for all Θk𝚯k\Theta_{k}\in\bm{\Theta}_{k}.
(ii). Suppose the likelihood function under model 𝔐k[.]\mathfrak{M}_{k}^{[.]} is defined as f[.](𝐲|Θk,𝔐k[.]):=lnf[.](𝐲|Θk)f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right):=\ln f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k}\right) for all 𝐲={y1,,yn}\mathbf{y}=\{y_{1},...,y_{n}\} is thrice differentiable with respect to Θk\Theta_{k} in the neighborhood of (Θ¯kδ,Θ¯k+δ)(\bar{\Theta}_{k}-\delta,\bar{\Theta}_{k}+\delta) so that

f1[.](𝐲|Θk,𝔐k[.]):=Θkf[.](𝐲|Θk,𝔐k[.]);\displaystyle f_{1}^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right):=\frac{\partial}{\partial\Theta_{k}}f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right);\ f2[.](𝐲|Θk,𝔐k[.]):=2Θk2f[.](𝐲|Θk,𝔐k[.]);\displaystyle f_{2}^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right):=\frac{\partial^{2}}{\partial\Theta_{k}^{2}}f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right);
and,f3[.](𝐲|Θk,𝔐k[.]):=\displaystyle\text{and,}\ f_{3}^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right):= 3Θk3f[.](𝐲|Θk,𝔐k[.]).\displaystyle\frac{\partial^{3}}{\partial\Theta_{k}^{3}}f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right).

Then the expectations at Θ¯k\bar{\Theta}_{k} corresponding to likelihood are 𝔼Θ¯k[f1[.](𝐲|Θk,𝔐k[.])]<\mathbb{E}_{\bar{\Theta}_{k}}\left[f_{1}^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right)\right]<\infty and 𝔼Θ¯k[f2[.](𝐲|Θk,𝔐k[.])]<\mathbb{E}_{\bar{\Theta}_{k}}\left[f_{2}^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right)\right]<\infty with

supΘk(Θ¯kδ,Θ¯k+δ)|f3[.](𝐲|Θk,𝔐k[.])|<L(𝐲);and,𝔼Θ¯k(L)<.\sup_{\Theta_{k}\in(\bar{\Theta}_{k}-\delta,\bar{\Theta}_{k}+\delta)}\left|f_{3}^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right)\right|<L(\mathbf{y});\ \text{and,}\ \mathbb{E}_{\bar{\Theta}_{k}}(L)<\infty.

(iii). After interchange the order of expectation w.r.t. Θ¯k\bar{\Theta}_{k} and differentiating f[.](𝐲|Θk,𝔐k[.])f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right) w.r.t. Θ¯k\bar{\Theta}_{k} such that, 𝔼Θ¯k[f1[.](𝐲|Θk,𝔐k[.])]=0\mathbb{E}_{\bar{\Theta}_{k}}\left[f_{1}^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right)\right]=0 and 𝔼Θ¯k[f2[.](𝐲|Θk,𝔐k[.])]=𝔼Θ¯k[f1[.](𝐲|Θk,𝔐k[.])]2\mathbb{E}_{\bar{\Theta}_{k}}\left[f_{2}^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right)\right]=-\mathbb{E}_{\bar{\Theta}_{k}}\left[f_{1}^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right)\right]^{2}.
(iv). Fisher information set (Θ¯k):=𝔼Θ¯k[f1[.](𝐲|Θk,𝔐k[.])]2>0\mathcal{I}\left(\bar{\Theta}_{k}\right):=\mathbb{E}_{\bar{\Theta}_{k}}\left[f_{1}^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right)\right]^{2}>0.
(v). Define fn[.](𝐲|Θk,𝔐k[.]):=i=1nf[.](yi|Θk,𝔐k[.])f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right):=\sum_{i=1}^{n}f^{[.]}\left(y_{i}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right). Then for all δ>0\delta>0, there ϵ>0\exists\ \epsilon>0 so that

𝒫Θ¯k{sup|ΘΘ¯k|>δn1[fn[.](𝐲|Θk,𝔐k[.])fn[.](𝐲|Θ¯k,𝔐k[.])]ϵ}1.\mathcal{P}_{\bar{\Theta}_{k}}\left\{\sup_{\left|\Theta-\bar{\Theta}_{k}\right|>\delta}n^{-1}\left[f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\bar{\Theta}_{k},\mathfrak{M}_{k}^{[.]}\right)\right]\leq-\epsilon\right\}\rightarrow 1.

(vi). The prior density π(Θk|𝔐k[.])\pi\left(\Theta_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right) under model 𝔐k[.]\mathfrak{M}_{k}^{[.]} is Lebesgue measurable, continuous and positive at Θ¯k\bar{\Theta}_{k}.

Proposition 1.

For model 𝔐k[.]\mathfrak{M}_{k}^{[.]} consider the density {f[.](𝐲|Θk);Θk𝚯k}\left\{f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k}\right);\ \Theta_{k}\in\bm{\Theta}_{k}\right\} for all k=1,,Kk=1,...,K satisfies Assumption 1. Let p(τ,𝔐k[.]|𝐲)p\left(\tau,\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}\right) be the posterior density of τ=n[ΘkΘkn(𝐲)]\tau=\sqrt{n}\left[\Theta_{k}-\Theta_{k}^{n}(\mathbf{y})\right] under model 𝔐k[.]\mathfrak{M}_{k}^{[.]}. Then

|p(τ,𝔐k[.]|𝐲)(Θ¯k)2πexp{12τ2(Θ¯k)}|dτ𝒫Θ¯k0.\int_{\mathbb{R}}\left|p\left(\tau,\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}\right)-\sqrt{\frac{\mathcal{I}\left(\bar{\Theta}_{k}\right)}{2\pi}}\exp\left\{-\frac{1}{2}\tau^{2}\mathcal{I}\left(\bar{\Theta}_{k}\right)\right\}\right|d\tau\overset{\mathcal{P}_{\bar{\Theta}_{k}}}{\to}0.
Proof.

See the Appendix. ∎

For a given data set 𝐲\mathbf{y} the model with the largest posterior probability is the most favorable model (Nam et al.,, 2022). Moreover, the Bayes factor for model 𝔐k[.]\mathfrak{M}_{k}^{[.]} with respect to 𝔐l[.]\mathfrak{M}_{l}^{[.]} can be expressed as

βkl(𝐲)m(𝐲|𝔐k[.])m(𝐲|𝔐l[.])=Θkf[.](𝐲|Θk,𝔐k[.])π(Θk|𝔐k[.])𝑑ΘkΘlf[.](𝐲|Θl,𝔐l[.])π(Θl|𝔐l[.])𝑑Θl.\beta_{kl}(\mathbf{y})\triangleq\frac{m\left(\mathbf{y}\big{|}\mathfrak{M}_{k}^{[.]}\right)}{m\left(\mathbf{y}\big{|}\mathfrak{M}_{l}^{[.]}\right)}=\frac{\int_{\Theta_{k}}f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right)\pi\left(\Theta_{k}|\mathfrak{M}_{k}^{[.]}\right)d\Theta_{k}}{\int_{\Theta_{l}}f^{[.]}\left(\mathbf{y}\big{|}\Theta_{l},\mathfrak{M}_{l}^{[.]}\right)\pi\left(\Theta_{l}|\mathfrak{M}_{l}^{[.]}\right)d\Theta_{l}}.

Although we have four models corresponding to Poisson, NB, ZIP and ZINP distributions, we are going to test two models at a time as written in hypotheses 1-4. Therefore, throughout this paper K=1K=1 (hence, K+1=2K+1=2). The Bayes factor of model 𝔐1[.]\mathfrak{M}_{1}^{[.]} with respect to 𝔐0[.]\mathfrak{M}_{0}^{[.]} is

β10(𝐲)m(𝐲|𝔐1[.])m(𝐲|𝔐0[.])=Θ1f[.](𝐲|Θ1,𝔐1[.])π(Θ1|𝔐1[.])𝑑Θ1Θ0f[.](𝐲|Θ0,𝔐0[.])π(Θ0|𝔐0[.])𝑑Θ0.\beta_{10}(\mathbf{y})\triangleq\frac{m\left(\mathbf{y}\big{|}\mathfrak{M}_{1}^{[.]}\right)}{m\left(\mathbf{y}\big{|}\mathfrak{M}_{0}^{[.]}\right)}=\frac{\int_{\Theta_{1}}f^{[.]}\left(\mathbf{y}\big{|}\Theta_{1},\mathfrak{M}_{1}^{[.]}\right)\pi\left(\Theta_{1}|\mathfrak{M}_{1}^{[.]}\right)d\Theta_{1}}{\int_{\Theta_{0}}f^{[.]}\left(\mathbf{y}\big{|}\Theta_{0},\mathfrak{M}_{0}^{[.]}\right)\pi\left(\Theta_{0}|\mathfrak{M}_{0}^{[.]}\right)d\Theta_{0}}.

Since each hypothesis consists of two models, we have π~(𝔐1[.])=1π~(𝔐0[.])\tilde{\pi}\left(\mathfrak{M}_{1}^{[.]}\right)=1-\tilde{\pi}\left(\mathfrak{M}_{0}^{[.]}\right) and

p(𝔐0[.]|𝐲)=11+β10(𝐲)π~(𝔐1[.])π~(𝔐0[.]).p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}\right)=\frac{1}{1+\beta_{10}(\mathbf{y})\frac{\tilde{\pi}\left(\mathfrak{M}_{1}^{[.]}\right)}{\tilde{\pi}\left(\mathfrak{M}_{0}^{[.]}\right)}}.

Furthermore, we choose model 𝔐0[.]\mathfrak{M}_{0}^{[.]} as the true model if

p(𝔐0[.]|𝐲)=11+β10(𝐲)π~(𝔐1[.])π~(𝔐0[.])>12β10(𝐲)π~(𝔐1[.])π~(𝔐0[.])<1β10(𝐲)<1,p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}\right)=\frac{1}{1+\beta_{10}(\mathbf{y})\frac{\tilde{\pi}\left(\mathfrak{M}_{1}^{[.]}\right)}{\tilde{\pi}\left(\mathfrak{M}_{0}^{[.]}\right)}}>\frac{1}{2}\implies\beta_{10}(\mathbf{y})\frac{\tilde{\pi}\left(\mathfrak{M}_{1}^{[.]}\right)}{\tilde{\pi}\left(\mathfrak{M}_{0}^{[.]}\right)}<1\implies\beta_{10}(\mathbf{y})<1,

and choose model 𝔐1[.]\mathfrak{M}_{1}^{[.]} as true model if β10(𝐲)>1\beta_{10}(\mathbf{y})>1. Following Kass and Vaidyanathan, (1992) and Wasserman, (2000) using non-informative prior (will be discussed in the next section) yields a general interpretation of Bayes factors as given in Table 1.

Table 1: Bayes Factors interpretation based upon Jeffreys’ Prior.
Bayes Factors with their meanings.
Bayes Factor Description
β10(𝐲)<110\beta_{10}(\mathbf{y})<\frac{1}{10} Strong presence of model 𝔐0[.]\mathfrak{M}_{0}^{[.]}.
110β10(𝐲)<13.2\frac{1}{10}\leq\beta_{10}(\mathbf{y})<\frac{1}{3.2} Moderate presence of model 𝔐0[.]\mathfrak{M}_{0}^{[.]}.
13.2β10(𝐲)<1\frac{1}{3.2}\leq\beta_{10}(\mathbf{y})<1 Weak presence of model 𝔐0[.]\mathfrak{M}_{0}^{[.]}.
1β10(𝐲)<3.21\leq\beta_{10}(\mathbf{y})<3.2 Weak presence of model 𝔐1[.]\mathfrak{M}_{1}^{[.]}.
3.2β10(𝐲)<103.2\leq\beta_{10}(\mathbf{y})<10 Moderate presence of model 𝔐1[.]\mathfrak{M}_{1}^{[.]}.
β10(𝐲)>10\beta_{10}(\mathbf{y})>10 Strong presence of model 𝔐1[.]\mathfrak{M}_{1}^{[.]}.

2.2 Objective priors in models with Poisson, NB, ZIP and ZINB distributions

A severe problem in Bayesian analysis is to choose an appropriate prior π(Θk|𝔐k[.])\pi\left(\Theta_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right) under model 𝔐k[.]\mathfrak{M}_{k}^{[.]}. The subjective Bayesian inference theory suggests that π(Θk|𝔐k[.])\pi\left(\Theta_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right) should be based on a person’s prior opinion on Θk\Theta_{k} (Wasserman,, 2000). More common Bayesian model selection approach is based on objective theory where π(Θk|𝔐k[.])\pi\left(\Theta_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right) is chosen to be noninformative in some sense (Wasserman,, 2000). A philosophical thinking behind this approach can be found in Kass and Wasserman, (1996). It is well known that if the common parameters are orthogonal to the rest of the parameters in each of the KK models, they can be assigned the same prior distribution since the Fisher Information matrix is block diagonal.(Bayarri et al.,, 2008; Jeffreys,, 1961; Kass and Vaidyanathan,, 1992). Since the arbitrary constant would be canceled in the Bayes factor, we use noninformative (or improper) prior in our case. A widely recognized noninformative prior is Jeffreys’ prior, defined as π(Θk|𝔐k[.])|I(Θ¯k)|1/2\pi\left(\Theta_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right)\propto\left|I\left(\bar{\Theta}_{k}\right)\right|^{1/2}. In this case (Θ¯k):=𝔼Θ¯k[f1[.](𝐲|Θk,𝔐k[.])]2>0\mathcal{I}\left(\bar{\Theta}_{k}\right):=\mathbb{E}_{\bar{\Theta}_{k}}\left[f_{1}^{[.]}\left(\mathbf{y}\big{|}\Theta_{k},\mathfrak{M}_{k}^{[.]}\right)\right]^{2}>0 is the Fisher information matrix as defined in Assumption 1. For example, if 𝐘N(Θk,I)\mathbf{Y}\sim N(\Theta_{k},I) then Jeffreys’ prior is a flat prior π(Θk|𝔐k[.])1\pi\left(\Theta_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right)\sim 1, where II and N(,)N(,) represent an identity matrix and a multivariate normal distribution, respectively (Wasserman,, 2000).

Since α^\hat{\alpha} and θ\theta in ZIP are not orthogonal, following Bayarri et al., (2008) with α^=α^+(1α^)exp(θ)\hat{\alpha}^{*}=\hat{\alpha}+(1-\hat{\alpha})\exp(-\theta) the density function fZIP(y|α^,θ)f^{ZIP}(y|\hat{\alpha},\theta) can be reparametrized as

fZIP(y|α^,θ)=α^𝟙(y=0)+(1α^)fTP(y|θ),y=0,1,2,,f_{*}^{ZIP}(y|\hat{\alpha}^{*},\theta)=\hat{\alpha}^{*}\mathbbm{1}(y=0)+(1-\hat{\alpha}^{*})f_{T}^{P}(y|\theta),\ \ y=0,1,2,..., (5)

where fTP(y|θ):=P(Y=y|Y>0;θ)=θy/{y![exp(θ)1]}f_{T}^{P}(y|\theta):=P(Y=y|Y>0;\theta)=\theta^{y}/\{y![\exp(\theta)-1]\} is the zero-truncated Poisson distribution with parameter θ\theta such that α^exp(θ)\hat{\alpha}^{*}\geq\exp(-\theta). Therefore, the expression for model 𝔐0P\mathfrak{M}_{0}^{P} is

fP(y|θ)=exp(θ)𝟙(y=0)+[1exp(θ)]fTP(y|θ),y=0,1,2,.f_{*}^{P}(y|\theta)=\exp(-\theta)\mathbbm{1}(y=0)+[1-\exp(-\theta)]f_{T}^{P}(y|\theta),\ \ y=0,1,2,.... (6)

According to the suggestions in Maity and Paul, (2022) with α=α+(1α)(1+κ/γ)γ\alpha^{*}=\alpha+(1-\alpha)(1+\kappa/\gamma)^{-\gamma} and for all α(1+κ/γ)γ\alpha^{*}\geq(1+\kappa/\gamma)^{-\gamma} the density function fZINB(y|α,γ,κ)f^{ZINB}(y|\alpha,\gamma,\kappa) can be represented as

fZINB(y|α,κ)=α𝟙(y=0)+(1α)fTNB(y|κ),y=0,1,2,,f_{*}^{ZINB}(y|\alpha,\kappa)=\alpha^{*}\mathbbm{1}(y=0)+(1-\alpha^{*})f_{T}^{NB}(y|\kappa),\ \ y=0,1,2,..., (7)

where

fTNB(y|κ):=P(Y=y|Y>0;κ)=Γ(y+γ)y!Γ(γ)(1+κγ)γ(1+γκ)y1(1+κγ)γ,f_{T}^{NB}(y|\kappa):=P(Y=y|Y>0;\kappa)=\frac{\frac{\Gamma(y+\gamma)}{y!\Gamma(\gamma)}\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\left(1+\frac{\gamma}{\kappa}\right)^{-y}}{1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}},

is the zero-truncated negative binomial distributions with parameter κ\kappa such that the expression for model 𝔐0NB\mathfrak{M}_{0}^{NB} is

fNB(y|κ)=(1+κγ)γ𝟙(y=0)+[1(1+κγ)γ]fTNB(y|κ),y=0,1,2,.f_{*}^{NB}(y|\kappa)=\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\mathbbm{1}(y=0)+\left[1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\right]f_{T}^{NB}(y|\kappa),\ \ y=0,1,2,.... (8)

As suggested by Bayarri et al., (2008), Jeffreys prior can be used for the common parameter and a proper prior for the extra parameters. It is well known that the Jeffreys prior for θ\theta in Poisson model, and for γ\gamma and κ\kappa in negative binomial model are πJP(θ|𝔐kP)=1/θ\pi_{J}^{P}\left(\theta\big{|}\mathfrak{M}_{k}^{P}\right)=1/\sqrt{\theta} and πJNB(κ|𝔐kNB)=γ/[κ(γ+κ)]\pi_{J}^{NB}\left(\kappa\big{|}\mathfrak{M}_{k}^{NB}\right)=\sqrt{\gamma/[\kappa(\gamma+\kappa)]}, respectively (Bayarri et al.,, 2008; Maity and Paul,, 2022). The Jeffreys prior for orthogonal ZIP (i.e., the Jeffreys prior of fTP(y|θ)f_{T}^{P}(y|\theta)) can be expressed as

πJZIP(θ|𝔐kZIP)=c1(θ)θ,wherec1(θ)=1(1+θ)exp(θ)1exp(θ).\pi_{J}^{ZIP}\left(\theta\big{|}\mathfrak{M}_{k}^{ZIP}\right)=\frac{c_{1}(\theta)}{\sqrt{\theta}},\ \text{where}\ \ c_{1}(\theta)=\frac{\sqrt{1-(1+\theta)\exp(-\theta)}}{1-\exp(-\theta)}.

In a similar fashion we can determine the Jeffreys prior for orthogonal ZINB (i.e., the Jeffreys prior of fTNB(y|κ)f_{T}^{NB}(y|\kappa)) can be expressed as

πJZINB(κ|𝔐kZINB)=c2(κ)γκ(κ+γ),\pi_{J}^{ZINB}\left(\kappa\big{|}\mathfrak{M}_{k}^{ZINB}\right)=c_{2}(\kappa)\sqrt{\frac{\gamma}{\kappa(\kappa+\gamma)}},

where,

c2(κ)=γ2κ2[1(1+κγ)γ][2(κ+γ)11(1+κγ)γ]κκ+γ+κ(κ+γ)(1+κγ)(2+γ)γ[1(1+κγ)γ][1+1γ+11(1+κγ)γ],c_{2}(\kappa)=\sqrt{\frac{\gamma^{2}}{\kappa^{2}\left[1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\right]}\left[2-\frac{(\kappa+\gamma)^{-1}}{1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}}\right]-\frac{\kappa}{\kappa+\gamma}+\frac{\kappa(\kappa+\gamma)\left(1+\frac{\kappa}{\gamma}\right)^{-(2+\gamma)}}{\gamma\left[1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\right]}\left[1+\frac{1}{\gamma}+\frac{1}{1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}}\right]}, (9)

for all κ,γ>0\kappa,\gamma>0. The derivation of Equation (9) is presented in the Appendix. Since we need to choose a single prior for both of the NB and ZINB, and as Maity and Paul, (2022) yields that working with any of πJNB(κ|𝔐kNB)\pi_{J}^{NB}\left(\kappa\big{|}\mathfrak{M}_{k}^{NB}\right) and πJZINB(α|κ,𝔐kZINB)\pi_{J}^{ZINB}\left(\alpha\big{|}\kappa,\mathfrak{M}_{k}^{ZINB}\right) will add negligible error in computing, we are going to choose the simpler prior version of πJNB(κ|𝔐kNB)=γ/[κ(γ+κ)]\pi_{J}^{NB}\left(\kappa\big{|}\mathfrak{M}_{k}^{NB}\right)=\sqrt{\gamma/[\kappa(\gamma+\kappa)]} for both of the NB and ZINB cases. In a similar fashion the simpler prior πJP(θ|𝔐kP)=1/θ\pi_{J}^{P}\left(\theta\big{|}\mathfrak{M}_{k}^{P}\right)=1/\sqrt{\theta} can be used for Poisson and ZIP cases (Bayarri et al.,, 2008). Under orthogonal ZIP model a proper prior for α^\hat{\alpha}^{*} given θ\theta is a uniform distribution over (exp(θ),1)(\exp(-\theta),1) is

πJZIP(α^|θ,𝔐kZIP)=1θ𝟙[exp(θ)<α^<1],and furthermore,πJZIP(α^|θ,𝔐kZIP)=1θ𝟙[0<α^<1].\pi_{J}^{ZIP}\left(\hat{\alpha}^{*}\big{|}\theta,\mathfrak{M}_{k}^{ZIP}\right)=\frac{1}{\sqrt{\theta}}\mathbbm{1}\left[\exp(-\theta)<\hat{\alpha}^{*}<1\right],\ \text{and furthermore,}\ \pi_{J}^{ZIP}\left(\hat{\alpha}\big{|}\theta,\mathfrak{M}_{k}^{ZIP}\right)=\frac{1}{\sqrt{\theta}}\mathbbm{1}\left[0<\hat{\alpha}<1\right].

Similarly, for an orthogonal ZINB model, a proper prior for α\alpha^{*} given κ\kappa is a uniform distribution over the interval ((1+κ/γ)γ,1)((1+\kappa/\gamma)^{-\gamma},1) it can be expressed as

πJZINB(α|κ,𝔐kZINB)=γκ(γ+κ)𝟙[(1+κγ)γ<α<1]or,πJZINB(α|κ,𝔐kZINB)=γκ(γ+κ)𝟙[0<α<1].\pi_{J}^{ZINB}\left(\alpha^{*}\big{|}\kappa,\mathfrak{M}_{k}^{ZINB}\right)=\sqrt{\frac{\gamma}{\kappa(\gamma+\kappa)}}\mathbbm{1}\left[\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}<\alpha^{*}<1\right]\ \text{or,}\ \pi_{J}^{ZINB}\left(\alpha\big{|}\kappa,\mathfrak{M}_{k}^{ZINB}\right)=\sqrt{\frac{\gamma}{\kappa(\gamma+\kappa)}}\mathbbm{1}\left[0<\alpha<1\right].

2.3 Objective Bayes factor in models with Poisson, NB, ZIP and ZINB distributions

In this section we are going to determine objective Bayes factors for each of the models explained in 1-4. For a sample of nn counts let ϖ=i=1n𝟙(Yi=0)\varpi=\sum_{i=1}^{n}\mathbbm{1}(Y_{i}=0) be the number of zero observations, and φ=i=1nYi\varphi=\sum_{i=1}^{n}Y_{i} be the total count. It is important to note that (ϖ=n)(φ=0)(\varpi=n)\equiv(\varphi=0) (Bayarri et al.,, 2008). Therefore, by Bayarri et al., (2008) for given data set 𝐲\mathbf{y}

fP(𝐲|θ)=θφexp(nθ)i=1nyi!,andfZIP(𝐲|α^,θ)=[α^+(1α^)exp(θ)]ϖ(1α^)nϖexp{(nϖ)θ}θφi=1nyi!,f^{P}(\mathbf{y}|\theta)=\frac{\theta^{\varphi}\exp(-n\theta)}{\prod_{i=1}^{n}y_{i}!},\ \text{and}\ f^{ZIP}(\mathbf{y}|\hat{\alpha},\theta)=\frac{[\hat{\alpha}+(1-\hat{\alpha})\exp(-\theta)]^{\varpi}(1-\hat{\alpha})^{n-\varpi}\exp\{-(n-\varpi)\theta\}\theta^{\varphi}}{\prod_{i=1}^{n}y_{i}!},

and by Maity and Paul, (2022)

fNB(𝐲|κ)\displaystyle f^{NB}(\mathbf{y}|\kappa) =(γγ+κ)nγ(κγ+κ)φ[i=1nΓ(yi+γ)yi!Γ(γ)],\displaystyle=\left(\frac{\gamma}{\gamma+\kappa}\right)^{n\gamma}\left(\frac{\kappa}{\gamma+\kappa}\right)^{\varphi}\left[\prod_{i=1}^{n}\frac{\Gamma(y_{i}+\gamma)}{y_{i}!\Gamma(\gamma)}\right],
fZINB(𝐲|α,κ)\displaystyle f^{ZINB}(\mathbf{y}|\alpha,\kappa) =[α+(1α)(1+κγ)γ]ϖ(γγ+κ)(nϖ)γ(κγ+κ)φ[i=1nΓ(yi+γ)yi!Γ(γ)].\displaystyle=\left[\alpha+(1-\alpha)\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\right]^{\varpi}\left(\frac{\gamma}{\gamma+\kappa}\right)^{(n-\varpi)\gamma}\left(\frac{\kappa}{\gamma+\kappa}\right)^{\varphi}\left[\prod_{i=1}^{n}\frac{\Gamma(y_{i}+\gamma)}{y_{i}!\Gamma(\gamma)}\right].

For φ>0\varphi>0 the marginal likelihood function of Poisson and ZIP distributions with Jeffreys priors πJP(θ|𝔐kP)\pi_{J}^{P}\left(\theta\big{|}\mathfrak{M}_{k}^{P}\right) and πJZIP(α^|θ,𝔐kZIP)\pi_{J}^{ZIP}\left(\hat{\alpha}\big{|}\theta,\mathfrak{M}_{k}^{ZIP}\right) respectively are

m(𝐲|𝔐kP)=Γ(φ+12)nφ+12i=1nyi!,andm(𝐲|𝔐kZIP)=ϖ!(n+1)!i=1nyi!j=0ϖ(nj)!(ϖj)!Γ(φ+12)(nj)(φ+12).m\left(\mathbf{y}\big{|}\mathfrak{M}_{k}^{P}\right)=\frac{\Gamma\left(\varphi+\frac{1}{2}\right)}{n^{\varphi+\frac{1}{2}}\prod_{i=1}^{n}y_{i}!},\ \text{and}\ \ m\left(\mathbf{y}\big{|}\mathfrak{M}_{k}^{ZIP}\right)=\frac{\varpi!}{(n+1)!\prod_{i=1}^{n}y_{i}!}\sum_{j=0}^{\varpi}\frac{(n-j)!}{(\varpi-j)!}\Gamma\left(\varphi+\frac{1}{2}\right)(n-j)^{-\left(\varphi+\frac{1}{2}\right)}.

On the other hand, following (Maity and Paul,, 2022) the marginal likelihood function of NB distribution with Jeffreys prior πJNB(κ|𝔐kNB)\pi_{J}^{NB}\left(\kappa\big{|}\mathfrak{M}_{k}^{NB}\right) is

m(𝐲|𝔐kNB)=1γ[i=1nΓ(yi+γ)yi!Γ(γ)]nγ+12Beta(φ+12,nγ),m\left(\mathbf{y}\big{|}\mathfrak{M}_{k}^{NB}\right)=\frac{1}{\sqrt{\gamma}}\left[\prod_{i=1}^{n}\frac{\Gamma(y_{i}+\gamma)}{y_{i}!\Gamma(\gamma)}\right]^{n\gamma+\frac{1}{2}}\text{Beta}\left(\varphi+\frac{1}{2},n\gamma\right),

where “Beta” represents a beta integration. Similarly by Maity and Paul, (2022), the marginal likelihood function of ZINB distribution with Jeffreys prior πJZINB(α|κ,𝔐kZINB)\pi_{J}^{ZINB}\left(\alpha\big{|}\kappa,\mathfrak{M}_{k}^{ZINB}\right) is

m(𝐲|𝔐kZINB)=ϖ!γ(n+1)![i=1nΓ(yi+γ)yi!Γ(γ)]nγ+12j=0ϖ(nj)!(ϖj)!Beta(φ+12,(nj)γ).m\left(\mathbf{y}\big{|}\mathfrak{M}_{k}^{ZINB}\right)=\frac{\varpi!}{\sqrt{\gamma}\ (n+1)!}\left[\prod_{i=1}^{n}\frac{\Gamma(y_{i}+\gamma)}{y_{i}!\Gamma(\gamma)}\right]^{n\gamma+\frac{1}{2}}\sum_{j=0}^{\varpi}\frac{(n-j)!}{(\varpi-j)!}\ \text{Beta}\left(\varphi+\frac{1}{2},(n-j)\gamma\right).

The Bayes factor of the NB model against the Poisson model (i.e., Hypothesis 1) is

β101(𝐲)m(𝐲|𝔐1NB)m(𝐲|𝔐0P)=n(φ+12)Γ(nγ)γΓ(φ+nγ+12)[i=1nΓ(yi+γ)Γ(γ)]nγ+12[i=1n1yi!]nγ12.\beta_{10}^{1}(\mathbf{y})\triangleq\frac{m\left(\mathbf{y}\big{|}\mathfrak{M}_{1}^{NB}\right)}{m\left(\mathbf{y}\big{|}\mathfrak{M}_{0}^{P}\right)}=\frac{n^{\left(\varphi+\frac{1}{2}\right)}\Gamma(n\gamma)}{\sqrt{\gamma}\ \Gamma\left(\varphi+n\gamma+\frac{1}{2}\right)}\left[\prod_{i=1}^{n}\frac{\Gamma(y_{i}+\gamma)}{\Gamma(\gamma)}\right]^{n\gamma+\frac{1}{2}}\left[\prod_{i=1}^{n}\frac{1}{y_{i}!}\right]^{n\gamma-\frac{1}{2}}. (10)

It is important to note that, the Bayes factor β101\beta_{10}^{1} is increasing in total count φ\varphi for any given γ\gamma and nn. When φ=0\varphi=0 or equivalently all counts are zero (𝐲=0\mathbf{y}=0), β101(0)=n1/2Γ(nγ)/[γΓ(nγ+1/2)]<\beta_{10}^{1}(0)=n^{1/2}\Gamma(n\gamma)/[\sqrt{\gamma}\ \Gamma(n\gamma+1/2)]<\infty. Following Bayarri et al., (2008) the Bayes factor of the ZIP model against the Poisson model (i.e., Hypothesis 2) is

β102(𝐲)m(𝐲|𝔐1ZIP)m(𝐲|𝔐0P)=ϖ!(n+1)!j=0ϖ(nj)!(ϖj)!(1jn)(φ+12).\beta_{10}^{2}(\mathbf{y})\triangleq\frac{m\left(\mathbf{y}\big{|}\mathfrak{M}_{1}^{ZIP}\right)}{m\left(\mathbf{y}\big{|}\mathfrak{M}_{0}^{P}\right)}=\frac{\varpi!}{(n+1)!}\sum_{j=0}^{\varpi}\frac{(n-j)!}{(\varpi-j)!}\left(1-\frac{j}{n}\right)^{-\left(\varphi+\frac{1}{2}\right)}.

Bayarri et al., (2008) suggests that when φ=0\varphi=0, m(𝐲=0|𝔐0P)=Γ(1/2)/nm\left(\mathbf{y}=0\big{|}\mathfrak{M}_{0}^{P}\right)=\Gamma(1/2)/\sqrt{n} and m(𝐲=0|𝔐1ZIP)=m\left(\mathbf{y}=0\big{|}\mathfrak{M}_{1}^{ZIP}\right)=\infty which implies β102(0)=\beta_{10}^{2}(0)=\infty. In this case, for a given nn, β102(𝐲)\beta_{10}^{2}(\mathbf{y}) is increasing in φ\varphi for any fixed ϖ\varpi, and is increasing in ϖ\varpi for any given φ\varphi (Bayarri et al.,, 2008). Now the Bayes factor of the ZINB model against the NB model (i.e., Hypothesis 3) is

β103(𝐲)m(𝐲|𝔐1ZINB)m(𝐲|𝔐0NB)=ϖ!Γ(φ+nγ+12)(n+1)!Γ(nγ)j=0ϖ(nj)!Γ((nj)γ)(ϖj)!Γ(φ+(nj)γ+12).\beta_{10}^{3}(\mathbf{y})\triangleq\frac{m\left(\mathbf{y}\big{|}\mathfrak{M}_{1}^{ZINB}\right)}{m\left(\mathbf{y}\big{|}\mathfrak{M}_{0}^{NB}\right)}=\frac{\varpi!\Gamma\left(\varphi+n\gamma+\frac{1}{2}\right)}{(n+1)!\Gamma(n\gamma)}\sum_{j=0}^{\varpi}\frac{(n-j)!\Gamma((n-j)\gamma)}{(\varpi-j)!\Gamma\left(\varphi+(n-j)\gamma+\frac{1}{2}\right)}.

For any give nn and γ\gamma if φ=0\varphi=0 then, β103(0)=j=0nΓ((nj)γ)Γ(nγ+1/2)/[nΓ(nγ)Γ((nj)γ+1/2)]<\beta_{10}^{3}(0)=\sum_{j=0}^{n}\Gamma((n-j)\gamma)\Gamma(n\gamma+1/2)/[n\Gamma(n\gamma)\Gamma((n-j)\gamma+1/2)]<\infty. Finally, the Bayes factor of the ZINB model against the model ZIP (i.e., Hypothesis 4) is

β104(𝐲)m(𝐲|𝔐1ZINB)m(𝐲|𝔐0ZIP)=1γ[i=1nΓ(yi+γ)Γ(γ)]nγ+12[i=1n1yi!]nγ12j=0ϖΓ((nj)γ)(nj)(φ+12)Γ(φ+(nj)γ+12).\beta_{10}^{4}(\mathbf{y})\triangleq\frac{m\left(\mathbf{y}\big{|}\mathfrak{M}_{1}^{ZINB}\right)}{m\left(\mathbf{y}\big{|}\mathfrak{M}_{0}^{ZIP}\right)}=\frac{1}{\sqrt{\gamma}}\left[\prod_{i=1}^{n}\frac{\Gamma(y_{i}+\gamma)}{\Gamma(\gamma)}\right]^{n\gamma+\frac{1}{2}}\left[\prod_{i=1}^{n}\frac{1}{y_{i}!}\right]^{n\gamma-\frac{1}{2}}\sum_{j=0}^{\varpi}\frac{\Gamma((n-j)\gamma)}{(n-j)^{-\left(\varphi+\frac{1}{2}\right)}\Gamma\left(\varphi+(n-j)\gamma+\frac{1}{2}\right)}.

It can be easily verified that for any given nn and γ\gamma, β104(𝐲)\beta_{10}^{4}(\mathbf{y}) is strictly increasing in φ\varphi. Furthermore, when φ=0\varphi=0, β104(0)=(γ)1/2j=0nΓ((nj)γ)/[(nj)1/2Γ((nj)γ+1/2)]<\beta_{10}^{4}(0)=(\gamma)^{-1/2}\sum_{j=0}^{n}\Gamma((n-j)\gamma)/\left[(n-j)^{-1/2}\Gamma\left((n-j)\gamma+1/2\right)\right]<\infty.

3 Simulation Study

In this section we carry out a series of simulation studies to estimate some operating characteristics of the Bayes factors derived in the previous Section.

3.1 Bayes factor of Negative Binomial against Poisson

In the first experiment, we generate 1000 simulated datasets from either the NB distribution or the Poisson distribution with different parameter settings. The exact values of the parameters are given in Table 2. For each simulation, we compute the Bayes factor derived in Section 2.32.3 that is the evidence of the ZINB distribution against the NB distribution. Note that, when computing the Bayes factor, γ\gamma has been assumed to be fixed as discussed in Section 2. Empirically, it has been noted that γ=1.001\gamma=1.001 offers the best outcome.

In the following, it is said that the Bayes factor fevers the NB model against the Poisson model if the computed log(Bayes factor) is more than log(3.2) or log(10). If the computed log(Bayes factor) is more than log(3.2) then the evidence is substantial and if the computed log(Bayes factor) is more than log(10) then it is said that there is strong evidence that the model under consideration is a NB model (see Table 1). On the other hand, if the computed log(Bayes factor) is less than log(3.2) or log(10), then the evidence is substantial or strong respectively in the favor of Poisson model. In terms of the notations introduced in Section 2, if we denote NB and Poisson model by 𝔐1\mathfrak{M}_{1} and 𝔐0\mathfrak{M}_{0} then Table 1 is directly applicable to draw the inference.

Table 2: Simulation result to count how many times Statistical procedures favor NB model against the Poisson model. BF3: Number of times the log(Bayes factor) is more than 3.2 (when the data generating model is NB) or less than 1/3.2 (when the data generating model is Poisson), BF10: Number of times the log(Bayes factor) is more than 10 (when the data generating model is NB) or less than 1/10 (when the data generating model is Poisson), Vuong: number of times the data generating model is selected by Vuong’s test, AIC: number of times the data generating model is selected by AIC criterion out of 1000 simulations.
λ\lambda γ\gamma κ\kappa Data Generating Model BF3 BF10 Vuong AIC
0.5 1.5 0.5 NB 969 900 59 628
Pois 66 17 123 928
0.5 0.5 NB 1000 998 518 979
Pois 68 14 126 945
0.5 1.5 NB 1000 1000 999 1000
Pois 75 19 126 941
1 1.5 0.5 NB 972 897 46 597
Pois 431 304 116 933
0.5 0.5 NB 999 995 517 973
Pois 426 301 116 942
0.5 1.5 NB 1000 1000 995 1000
Pois 393 263 100 935
3 1.5 0.5 NB 961 893 63 608
Pois 988 980 102 933
0.5 0.5 NB 999 996 533 976
Pois 989 982 106 934
0.5 1.5 NB 1000 1000 996 1000
Pois 992 986 58 942
5 1.5 0.5 NB 963 929 63 619
Pois 1000 1000 84 935
0.5 0.5 NB 1000 998 519 978
Pois 1000 1000 104 936
0.5 1.5 NB 1000 1000 996 1000
Pois 1000 1000 111 927

Table 2 summarizes the result how many times the zero inflated model is selected out of 1000 simulations using the Bayes factor comparisons. Additionally, we have included the outcome using the Vuong’s test (Vuong,, 1989) and akaiake information criterion (AIC, (Akaike,, 1998)). R package nonnest2 (Merkle and You,, 2020; R Core Team,, 2021) has been utilized to carry out Vuong’s test.

Nevertheless, it is evident from Table 2 that Bayes factor remains superior in selecting the correct model if the data generating model follows a NB distribution. It remains superior in selecting the correct model when the data generating model is a Poisson model if the mean of the Poisson distribution is high. Moreover, the criterion – log(Bayes factor) more than log(3.2) (BF3) – selects the zero inflated model more often than the criterion – log(Bayes factor) more than log(10) (BF10) – for obvious reason. For instance, with data generating λ=5,γ=1.5,κ=0.5\lambda=5,\gamma=1.5,\kappa=0.5, when the sample is simulated from a NB distribution, then BF3 and BF10 are able to recover the NB distribution 963 times and 929 times respectively. On the other hand, AIC criterion indicates that 619 datasets follows the NB model out of 1000 simulated datasets. With the same data generating parameters, when the data are simulated from a Poisson distribuion, then, BF3 and B10 are able to recover the Poisson model 1000 times and 1000 times respctively, outperfroming the AIC creterion which is able to indicate in the favor of the Poisson model 935 times. Note that, the performance of Vuong’s test remains inferior throughout the simulation studies.

3.2 Bayes factor of Zero Inflated Poisson against Poisson

In the second experiment, we generate 1000 simulated datasets from either the zero inflated Poisson (ZIP) distribution or the Poisson distribution with different parameter settings. The exact values of the parameters are given in Table 3. The data generation and the inference follows the similar paths as the first simulated example.

Table 3: Simulation result to count how many times Statistical procedures favor ZIP model against the Poisson model. BF3: Number of times the log(Bayes factor) is more than 3.2 (when the data generating model is ZIP) or less than 1/3.2 (when the data generating model is Poisson), BF10: Number of times the log(Bayes factor) is more than 10 (when the data generating model is ZIP) or less than 1/10 (when the data generating model is Poisson), Vuong: number of times the data generating model is selected by Vuong’s test, Inflation: Number of times it is predicted that the data are zero inflated, AIC: number of times the data generating model is selected by AIC criterion out of 1000 simulations. Percentage (%) of Zeros: Percentage of zeros present in the data.
λ\lambda Percentage (%) of Zeros Data Generating Model BF3 BF10 Vuong Inflation AIC
0.5 97.7 ZIP 415 294 36 2 415
60.9 Pois 559 28 45 814 939
90.3 ZIP 590 362 80 50 644
60.8 Pois 567 35 51 790 935
80.3 ZIP 390 191 36 195 519
60.6 Pois 578 23 38 789 929
70.6 ZIP 135 48 9 178 258
60.5 Pois 560 18 44 799 943
1 96.8 ZIP 765 633 173 75 765
37 Pois 810 349 46 388 922
84.3 ZIP 937 859 508 743 958
36.8 Pois 815 320 47 409 944
68.2 ZIP 869 715 419 935 949
36.6 Pois 795 322 47 385 926
52.5 ZIP 390 220 77 806 643
36.7 Pois 803 338 50 390 918
3 95.2 ZIP 995 989 810 868 995
4.9 Pois 959 853 72 207 926
76.5 ZIP 1000 1000 999 1000 1000
4.9 Pois 963 845 66 194 937
52.5 ZIP 1000 1000 1000 1000 1000
4.9 Pois 959 845 72 203 934
28.6 ZIP 1000 999 995 1000 1000
5 Pois 956 849 80 228 934
5 95 ZIP 1000 1000 1000 1000 1000
1.3 Pois 961 884 0 642 884
74.9 ZIP 1000 1000 1000 1000 1000
1.4 Pois 962 879 0 710 881
50.3 ZIP 1000 1000 1000 1000 1000
1.4 Pois 964 892 0 620 891
25.5 ZIP 1000 1000 1000 1000 1000
1.4 Pois 964 898 0 652 896

Table 3 summarizes the result how many times the zero inflated model is selected out of 1000 simulations using the Bayes factor comparisons, Vuong’s test and the AIC criterion. An additional outcome has been included using the R package performance written by Lüdecke et al., (2021). This package offers functionality to check if excessive amount of zeros are present in the data. In this way, if it is determined that the number of existing zero’s are than the usual then one can conclude that the data follows a zero inflated distribution.

Nevertheless, it is evident from Table 3 that Bayes factor remains superior in selecting the correct model, particularly, when the mean of the Possion distribution is large. The other inferences remain similar to the first simulated example.

3.3 Bayes factor of Zero Inflated Negative Binomial against Negative Binomial

In the third experiment, we generate 1000 simulated datasets from either the zero inflated Negative Binomial (ZINB) distribution or the Negative Binomial (NB) distribution with different parameter settings. The exact values of the parameters are given in Table 4. The data generation and the inference follows the similar paths as the previous examples.

Table 4: Simulation result to count how many times Statistical procedures favor ZINB model against the NB model. BF3: Number of times the log(Bayes factor) is more than 3.2 (when the data generating model is ZINB) or less than 1/3.2 (when the data generating model is NB), BF10: Number of times the log(Bayes factor) is more than 10 (when the data generating model is ZINB) or less than 1/10 (when the data generating model is NB), Vuong: number of times the data generating model is selected by Vuong’s test, Inflation: Number of times it is predicted that the data are zero inflated, AIC: number of times the data generating model is selected by AIC criterion out of 1000 simulations. Percentage (%) of Zeros: Percentage of zeros present in the data.
γ\gamma κ\kappa Percentage (%) of Zeros Data Generating Model BF3 BF10 Vuong Inflation AIC
1.5 0.5 96.9 ZINB 1000 1000 54 32.6 848
45.5 NB 40 0 20 910 470
86.8 ZINB 1000 1000 60 0 820
46.5 NB 50 0 30 909 485
73.8 ZINB 1000 1000 120 0 870
46.2 NB 60 10 20 930 440
59.8 ZINB 1000 990 40 0 720
47.0 NB 40 10 0 879 374
0.5 0.5 97.6 ZINB 1000 1000 34 966 896
57.3 NB 0 0 20 1000 450
92.4 ZINB 1000 1000 20 0 808
71.0 NB 0 0 0 1000 410
85.2 ZINB 1000 1000 20 0 737
71.0 NB 0 0 10 1000 420
77.6 ZINB 1000 1000 20 0 600
58.5 NB 0 0 10 1000 440
5 5 94.9 ZINB 1000 1000 646 545 1000
3.3 NB 1000 1000 20 265 510
75.9 ZINB 1000 1000 900 200 1000
3.2 NB 1000 10000 0 280 480
51.4 ZINB 1000 1000 987 953 1000
3.3 NB 1000 1000 41 301 499
27.7 ZINB 910 860 930 1000 1000
3.3 NB 1000 1000 20.8 271 521

Table 4 summarizes the result how many times the zero inflated model is selected out of 1000 simulations using the Bayes factor comparisons, Vuong’s test, inflation, and the AIC criterion. It is evident from Table 4 that Bayes factor remains superior in selecting the correct model, particularly, when the parameters of the NB distribution are large. The other inferences remain similar to the previous simulated examples.

3.4 Bayes factor of Zero Inflated Negative Binomial against Zero Inflated Poisson

In the last experiment, we generate 1000 simulated datasets from either the zero inflated Negative Binomial (ZINB) distribution or the zero inflated Poisson (ZIP) distribution with different parameter settings. The exact values of the parameters are given in Table 5. The data generation and the inference follows the similar paths as the previous examples.

Table 5: Simulation result to count how many times Statistical procedures favor ZINB model against the ZIP model. BF3: Number of times the log(Bayes factor) is more than 3.2 (when the data generating model is ZINB) or less than 1/3.2 (when the data generating model is NB), BF10: Number of times the log(Bayes factor) is more than 10 (when the data generating model is ZINB) or less than 1/10 (when the data generating model is NB), Vuong: number of times the data generating model is selected by Vuong’s test, Inflation: Number of times it is predicted that the data are zero inflated, AIC: number of times the data generating model is selected by AIC criterion out of 1000 simulations. Percentage (%) of Zeros: Percentage of zeros present in the data.
λ\lambda γ\gamma κ\kappa Percentage (%) of Zeros Data Generating Model BF3 BF10 Vuong AIC
1 0.5 0.5 84 ZINB 587 587 49 665
68.1 ZIP 990 990 61 624
1 5 5 53 ZINB 1000 1000 278 994
69.4 ZIP 989 989 45 603
3 0.5 0.5 86.6 ZINB 579 579 41 665
52.6 ZIP 518 518 101 556
3 5 5 53.2 ZINB 999 999 281 994
54.4 ZIP 518 518 84 573

Table 5 summarizes the result how many times the zero inflated model is selected out of 1000 simulations using the Bayes factor comparisons, Vuong’s test, and the AIC criterion. It is evident from Table 5 that Bayes factor remains superior in selecting the correct model.

4 Model Selection in Microbiome Data

In this Section we apply the Bayes factor computation techniques discussed here in a real life data originated from a case-control study. The objective of the original experiment was to gain knowledge of the vaginal microbioata throughout pregnancy. Toward this end, a longitudinal case control study was designed in 22 pregnant women who delivered at term (38 to 42 weeks) without complications, and 32 non-pregnant women. Serial samples of vaginal fluid were collected from both non-pregnant and pregnant patients. The data includes 16S rRNA gene sequence based vaginal microbiota from which samples are collected from each subject over intervals of weeks, resulting in 143 taxa and N = 900 longitudinal samples (139 measurements from pregnant women and 761 measurements from non-pregnant women.) For more details on the experiment see Romero et al., (2014); also see Jiang et al., (2023).

For the analysis, we focused on two specific Phylotypes: Lactobacillus.iners and Atopobium. Each dataset contained 900 observations, with the first dataset having 15.1% zeros and the second dataset having 66.3% zeros. We computed the log(Bayes factor) and AIC criteria for four models: Negative Binomial (NB), Poisson, Zero-Inflated Negative Binomial (ZINB), and Zero-Inflated Poisson (ZIP).

Table 6 presents the computed log(Bayes Factor) on the Microbiome data, while Table 7 displays the AIC values for each model. Note that, a Negative Binomial model cannot be fitted to the data because the underlying maximization process does not converge. For the same reason, the Bayes factor of NB against Poisson model cannot be computed.

For the first dataset, the log(Bayes factor) of ZINB against NB and of ZIP against Poisson are 829.0 and 171854.9, respectively, which favors a zero Inflated model for the data. Consequently, the log(Bayes factor) of ZINB against ZIP becomes -13686110.0 which implies that one should fit a zero inflated Poisson model to the data. On the other hand, the AIC criterion supports to fit a zero Inflated Negative Binomial model to the data. Romero et al., (2014) concluded in the favor of fitting a negative Binomial model.

Table 6: Computed log(Bayes Factor) on the Microbiome data.
Example Model log(Bayes factor)
1 NB vs. Poisson
ZINB vs. NB 829.0
ZIP vs. Poisson 171854.9
ZINB vs. ZIP -13686110.0
2 NB vs. Poisson
ZINB vs. NB 1172.6
ZIP vs. Poisson 5073.6
ZINB vs. ZIP 120266.8
Table 7: Computed log(Bayes Factor) on the Microbiome data.
Example Model AIC
1 NB
Poisson 1667918.0
ZINB 12513.0
ZIP 1324204.0
2 NB
Poisson 24913.9
ZINB 3342.2
ZIP 14763.3

A very similar analysis concludes that a ZINB model is the appropriate one to fit into the second dataset. This can be concluded by computing both th Bayes factor and the AIC. Furthermore, this inference accords with the findings of Romero et al., (2014).

Overall, the Bayes factor and AIC analyses provide insights into selecting the appropriate models for further analysis of the vaginal microbiota data obtained from the case-control study.

5 Discussion

In recent years, a significant effort has done in the literature of longitudinal metagenomics to investigate dynamic associations between microbial symbiosis and the development of many diseases, such as inflammatory bowl diseases (Sharpton et al.,, 2017; Minar, 2018a, ), colorectal cancers (Liang et al.,, 2014), Parkinson’s disease (Yang et al.,, 2018; Minar, 2018b, ; Minar,, 2019), preterm birth (Stewart et al.,, 2017), and autoimmune diseases (Vatanen et al.,, 2016; Zhang and Yi,, 2020; Roy et al., 2023c, ; Roy et al., 2023a, ). The literature discussed above either used 16S rRNA or whole-metagenome shotgun sequencing technologies to simulate longitudinal metagenomics data (Zhang and Yi,, 2020; Roy et al., 2023b, ). While the bioinformatics tools for processing 16S rRNA sequencing data give the counts, whole-metagenome shotgun sequencing data give either counts or proportions. In this article, we considered the longitudinal metagenomic count data generated from 16S rRNA sequencing based vaginal microbiota. Since the objective was to gain knowledge of the vaginal microbiota throughout pregnancy, a longitudinal case control study was designed in 22 pregnant women who delivered at term (38 to 42 weeks) without complications, and 32 non-pregnant women, and serial samples of vaginal fluid were collected from both non-pregnant and pregnant patients. Moreover, we analyzed on two specific Phylotypes: Lactobacillus.iners and Atopobium. Each dataset contained 900 observations, with the first dataset having 15.1% zeros and the second dataset having 66.3% zeros. We computed the log(Bayes factor) and AIC criteria for four models: Negative Binomial (NB), Poisson, Zero-Inflated Negative Binomial (ZINB), and Zero-Inflated Poisson (ZIP).

In this article, we presented Poisson, NB, ZIP, and ZINB distributions to analyze high-throughput sequencing microbiome data. First, we verified some convergence and measurability properties of the posterior distribution. Second, the Jeffreys prior was calculated for ZINB. Then the presence of over-dispersion was tested by using Bayesian methodologies. We introduced the Bayes factor for ZINB and ZIP and tested for the model selection under the incidence of over dispersed data. For each of the four distributions, we used non-informative Jeffreys prior and determined Bayes factors corresponding to the hypotheses 1-4. We did simulation studies of the distributions with different parameters to determine the effectiveness of Bayes’ factors (i.e., BF3 and BF10) compared with traditional AIC and Vuong’s test. We showed that BF3 and BF10 outperformed AIC and Vuong’s test in every case. For example, in the case of NB versus Poisson with λ=1\lambda=1, γ=1.5\gamma=1.5, and κ=0.5\kappa=0.5, when a sample is generated by simulating an NB, then BF3 and BF10 would be able to recover the NB distribution 972 and 897 times, respectively (see Table 2). On the other hand, AIC indicates that 597 datasets follow the NB distribution out of 1000 simulated datasets. In this case, Vuong’s test gives the most inferior result and throughout our simulation studies, its performance was the worst. To conduct the quantitative analysis, R package BFZINBZIP was used which is available at authors’ github account.

Our method is novel in identifying differentially 143 taxa for two patient groups (i.e., pregnant and non-pregnant women) under a single statistical framework which allows for an integrative analysis of the microbiome and other omics data sets. The proposed method can lead to proper clinical decisions corresponding to the precision shaping of the microbiome data. Furthermore, BF3 and BF10 proposed in this article perform better than AIC and Vuong’s test throughout our simulation studies and real data analysis. In real data analysis, since the underlying maximization process of 16S rRNA data do not converge, an NB distribution is impossible to fit. As a result, the Bayes factor of NB against the Poisson model cannot be determined. In Table, 6, the log(Bayes factor) of ZINB against NB, and of ZIP against Poisson are 829.0 and 171854.9, respectively, which supports the zero-inflated model for our data set. On the other hand, the log(Bayes factor) of ZINB against ZIP is -13686110.0 supports in favor of the implementation of a ZIP model to the data. Furthermore, the AIC criterion in Table 7 goes in favor of fitting a ZINB to the data. Tables 6 and 7 give similar results for the second set of data which favors the implementation of a ZINB model as the log(Bayes factor) and AIC are 120266.8 and 3342.2, respectively. This inference is similar to the results obtained in Romero et al., (2014).

The framework of the proposed method allows for several extensions. For example, the current model supports two groups (i.e., pregnant and non-pregnant women). We can extend our current model to multiple phenotype type groups including intermediate phenotypes. In this case, our method can incorporate group-specific parameters while holding other parameters fixed, and same poaterior inference can be incorporated. We can extend our proposed model to a regression framework where the normalized microbiome normalized abundance can be used as a the response which would integrate metabolite compounds, as predictors (Jiang et al.,, 2021). Another extension would be to discuss correlated covariates such as longitudinal clinical measurements (Zhang et al.,, 2017; Jiang et al.,, 2021).

Conflict of Interest

None declared.

Supplementary Material

An R package BFZINBZIP is available on Github:
https://github.com/arnabkrmaity/BFZINBZIP/tree/main. This package has been used to do simulations in Section 3 and real data analysis in Section 4.

Data Availability Statement

Romero data is available in their paper Romero et al., (2014) or directly from the R package NBZIMM.

Appendix

Proof of Lemma 1

In order to prove this lemma we will show that for the probability measure at Θ¯k\bar{\Theta}_{k} of model 𝔐k[.]\mathfrak{M}_{k}^{[.]} denoted as 𝒫Θk(Ω)=1\mathcal{P}_{\Theta_{k}}^{\infty}(\Omega^{*})=1

𝚯kp(𝔐1[.]|𝐲(ω))|1p(𝔐0[.]|𝐲(ω))p(𝔐1[.]|𝐲(ω))|𝑑ζ(Θk)0,\int_{\bm{\Theta}_{k}}p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}(\omega)\right)\left|1-\frac{p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}(\omega)\right)}{p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}(\omega)\right)}\right|d\zeta(\Theta_{k})\rightarrow 0,

where p(𝔐k[.]|𝐲(ω))p\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}(\omega)\right) be a posterior density distribution of kthk^{th} model so that k=0,1k=0,1. Using the continuity at point Θ¯kΘk\bar{\Theta}_{k}\subset\Theta_{k}, there {δ,ϵ,λ}\exists\{\delta,\epsilon,\lambda\} so that for all δ>0,ϵ>0\delta>0,\epsilon>0 and λ>0\lambda>0 there exists a neighborhood 𝒩\mathcal{N} of Θ¯k\bar{\Theta}_{k}such that Θk𝒩\forall\ \Theta_{k}\in\mathcal{N}, and k=1,2k=1,2,

|π(Θ0|𝔐0[.])π(Θ1|𝔐1[.])π(Θ¯0|𝔐0[.])π(Θ¯1|𝔐1[.])|<δ,and|π(Θ¯k|𝔐k[.])π(Θk|𝔐k[.])|<δ.\left|\frac{\pi\left(\Theta_{0}|\mathfrak{M}_{0}^{[.]}\right)}{\pi\left(\Theta_{1}|\mathfrak{M}_{1}^{[.]}\right)}-\frac{\pi\left(\bar{\Theta}_{0}|\mathfrak{M}_{0}^{[.]}\right)}{\pi\left(\bar{\Theta}_{1}|\mathfrak{M}_{1}^{[.]}\right)}\right|<\delta,\ \text{and}\ \left|\pi\left(\bar{\Theta}_{k}|\mathfrak{M}_{k}^{[.]}\right)-\pi\left(\Theta_{k}|\mathfrak{M}_{k}^{[.]}\right)\right|<\delta.

By consistency there exists a sample space Ω\Omega, 𝒫θ¯k(Ω)=1\mathcal{P}_{\bar{\theta}_{k}}^{\infty}(\Omega)=1, so that for each ωΩ\omega\in\Omega, we have the posterior probability at neighborhood 𝒩\mathcal{N} of Θk\Theta_{k} as

p𝒩(𝔐k[.]|𝐲(ω))=𝒩i=1nf[.](yi(ω)|𝚯k,𝔐k[.])π(Θk|𝔐k[.])dζ(Θk)𝚯ki=1nf[.](yi(ω)|𝚯k,𝔐k[.])π(Θk|𝔐k[.])dζ(Θk)1.p_{\mathcal{N}}\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}(\omega)\right)=\frac{\int_{\mathcal{N}}\prod_{i=1}^{n}f^{[.]}\left(y_{i}(\omega)|\bm{\Theta}_{k},\mathfrak{M}_{k}^{[.]}\right)\pi\left(\Theta_{k}|\mathfrak{M}_{k}^{[.]}\right)d\zeta(\Theta_{k})}{\int_{\bm{\Theta}_{k}}\prod_{i=1}^{n}f^{[.]}\left(y_{i}(\omega)|\bm{\Theta}_{k},\mathfrak{M}_{k}^{[.]}\right)\pi\left(\Theta_{k}|\mathfrak{M}_{k}^{[.]}\right)d\zeta(\Theta_{k})}\rightarrow 1.

For all ωΩ\omega\in\Omega there exists η\eta^{*} such that for all n>ηn>\eta^{*} the posterior probability is

p𝒩(𝔐k[.]|𝐲(ω))1λ,for allk=1,2.p_{\mathcal{N}}\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}(\omega)\right)\geq 1-\lambda,\ \text{for all}\ k=1,2.

Furthermore, the ratio of two posterior distributions is

p(𝔐0[.]|𝐲(ω))p(𝔐1[.]|𝐲(ω))=π(Θ0|𝔐0[.])π(Θ1|𝔐1[.])𝚯ki=1nf[.](yi(ω)|𝚯1,𝔐1[.])π(Θ1|𝔐1[.])dζ(Θ1)𝚯ki=1nf[.](yi(ω)|𝚯0,𝔐0[.])π(Θ0|𝔐0[.])dζ(Θ0).\frac{p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}(\omega)\right)}{p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}(\omega)\right)}=\frac{\pi\left(\Theta_{0}|\mathfrak{M}_{0}^{[.]}\right)}{\pi\left(\Theta_{1}|\mathfrak{M}_{1}^{[.]}\right)}\frac{\int_{\bm{\Theta}_{k}}\prod_{i=1}^{n}f^{[.]}\left(y_{i}(\omega)|\bm{\Theta}_{1},\mathfrak{M}_{1}^{[.]}\right)\pi\left(\Theta_{1}|\mathfrak{M}_{1}^{[.]}\right)d\zeta(\Theta_{1})}{\int_{\bm{\Theta}_{k}}\prod_{i=1}^{n}f^{[.]}\left(y_{i}(\omega)|\bm{\Theta}_{0},\mathfrak{M}_{0}^{[.]}\right)\pi\left(\Theta_{0}|\mathfrak{M}_{0}^{[.]}\right)d\zeta(\Theta_{0})}.

For all n>ηn>\eta^{*} and Θk𝒩\Theta_{k}\in\mathcal{N} yields

(1λ)[π(Θ¯0|𝔐0[.])π(Θ¯1|𝔐1[.])δ]\displaystyle(1-\lambda)\left[\frac{\pi\left(\bar{\Theta}_{0}|\mathfrak{M}_{0}^{[.]}\right)}{\pi\left(\bar{\Theta}_{1}|\mathfrak{M}_{1}^{[.]}\right)}-\delta\right] {𝒩i=1nf[.](yi(ω)|𝚯1,𝔐1[.])π(Θ1|𝔐1[.])dζ(Θ1)𝒩i=1nf[.](yi(ω)|𝚯0,𝔐0[.])π(Θ0|𝔐0[.])dζ(Θ0)}p(𝔐0[.]|𝐲(ω))p(𝔐1[.]|𝐲(ω))\displaystyle\left\{\frac{\int_{\mathcal{N}}\prod_{i=1}^{n}f^{[.]}\left(y_{i}(\omega)|\bm{\Theta}_{1},\mathfrak{M}_{1}^{[.]}\right)\pi\left(\Theta_{1}|\mathfrak{M}_{1}^{[.]}\right)d\zeta(\Theta_{1})}{\int_{\mathcal{N}}\prod_{i=1}^{n}f^{[.]}\left(y_{i}(\omega)|\bm{\Theta}_{0},\mathfrak{M}_{0}^{[.]}\right)\pi\left(\Theta_{0}|\mathfrak{M}_{0}^{[.]}\right)d\zeta(\Theta_{0})}\right\}\leq\frac{p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}(\omega)\right)}{p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}(\omega)\right)}
1(1λ)[π(Θ¯0|𝔐0[.])π(Θ¯1|𝔐1[.])+δ]{𝒩i=1nf[.](yi(ω)|𝚯1,𝔐1[.])π(Θ1|𝔐1[.])dζ(Θ1)𝒩i=1nf[.](yi(ω)|𝚯0,𝔐0[.])π(Θ0|𝔐0[.])dζ(Θ0)},\displaystyle\leq\frac{1}{(1-\lambda)}\left[\frac{\pi\left(\bar{\Theta}_{0}|\mathfrak{M}_{0}^{[.]}\right)}{\pi\left(\bar{\Theta}_{1}|\mathfrak{M}_{1}^{[.]}\right)}+\delta\right]\left\{\frac{\int_{\mathcal{N}}\prod_{i=1}^{n}f^{[.]}\left(y_{i}(\omega)|\bm{\Theta}_{1},\mathfrak{M}_{1}^{[.]}\right)\pi\left(\Theta_{1}|\mathfrak{M}_{1}^{[.]}\right)d\zeta(\Theta_{1})}{\int_{\mathcal{N}}\prod_{i=1}^{n}f^{[.]}\left(y_{i}(\omega)|\bm{\Theta}_{0},\mathfrak{M}_{0}^{[.]}\right)\pi\left(\Theta_{0}|\mathfrak{M}_{0}^{[.]}\right)d\zeta(\Theta_{0})}\right\},

and by the choice of 𝒩\mathcal{N},

[π(Θ¯k|𝔐k[.])δ]\displaystyle\left[\pi\left(\bar{\Theta}_{k}|\mathfrak{M}_{k}^{[.]}\right)-\delta\right] 𝒩i=1nf[.](yi(ω)|𝚯0,𝔐0[.])dζ(Θ0)𝒩i=1nf[.](yi(ω)|𝚯0,𝔐0[.])π(Θ0|𝔐0[.])dζ(Θ0)\displaystyle\int_{\mathcal{N}}\prod_{i=1}^{n}f^{[.]}\left(y_{i}(\omega)|\bm{\Theta}_{0},\mathfrak{M}_{0}^{[.]}\right)d\zeta(\Theta_{0})\leq\int_{\mathcal{N}}\prod_{i=1}^{n}f^{[.]}\left(y_{i}(\omega)|\bm{\Theta}_{0},\mathfrak{M}_{0}^{[.]}\right)\pi\left(\Theta_{0}|\mathfrak{M}_{0}^{[.]}\right)d\zeta(\Theta_{0})
[π(Θ¯k|𝔐k[.])+δ]𝒩i=1nf[.](yi(ω)|𝚯0,𝔐0[.])dζ(Θ0).\displaystyle\leq\left[\pi\left(\bar{\Theta}_{k}|\mathfrak{M}_{k}^{[.]}\right)+\delta\right]\int_{\mathcal{N}}\prod_{i=1}^{n}f^{[.]}\left(y_{i}(\omega)|\bm{\Theta}_{0},\mathfrak{M}_{0}^{[.]}\right)d\zeta(\Theta_{0}). (11)

For Θk𝒩\Theta_{k}\in\mathcal{N} the inequality Proof of Lemma 1 yields

(1λ)\displaystyle(1-\lambda) [π(Θ¯0|𝔐0[.])π(Θ¯1|𝔐1[.])δ][π(Θ¯1|𝔐1[.])δπ(Θ¯0|𝔐0[.])+δ]p(𝔐0[.]|𝐲(ω))p(𝔐1[.]|𝐲(ω))\displaystyle\left[\frac{\pi\left(\bar{\Theta}_{0}|\mathfrak{M}_{0}^{[.]}\right)}{\pi\left(\bar{\Theta}_{1}|\mathfrak{M}_{1}^{[.]}\right)}-\delta\right]\left[\frac{\pi\left(\bar{\Theta}_{1}|\mathfrak{M}_{1}^{[.]}\right)-\delta}{\pi\left(\bar{\Theta}_{0}|\mathfrak{M}_{0}^{[.]}\right)+\delta}\right]\leq\frac{p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}(\omega)\right)}{p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}(\omega)\right)}
11λ[π(Θ¯0|𝔐0[.])π(Θ¯1|𝔐1[.])+δ][π(Θ¯1|𝔐1[.])+δπ(Θ¯0|𝔐0[.])δ],\displaystyle\leq\frac{1}{1-\lambda}\left[\frac{\pi\left(\bar{\Theta}_{0}|\mathfrak{M}_{0}^{[.]}\right)}{\pi\left(\bar{\Theta}_{1}|\mathfrak{M}_{1}^{[.]}\right)}+\delta\right]\left[\frac{\pi\left(\bar{\Theta}_{1}|\mathfrak{M}_{1}^{[.]}\right)+\delta}{\pi\left(\bar{\Theta}_{0}|\mathfrak{M}_{0}^{[.]}\right)-\delta}\right],

so that for small values of δ\delta and λ\lambda we have

|p(𝔐0[.]|𝐲(ω))p(𝔐1[.]|𝐲(ω))|<ϵ.\left|\frac{p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}(\omega)\right)}{p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}(\omega)\right)}\right|<\epsilon.

Finally, for n>ηn>\eta^{*},

𝚯k\displaystyle\int_{\bm{\Theta}_{k}} |p(𝔐0[.]|𝐲(ω))p(𝔐1[.]|𝐲(ω))|dζ(Θk)\displaystyle\left|p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}(\omega)\right)-p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}(\omega)\right)\right|d\zeta(\Theta_{k})
𝒩p(𝔐1[.]|𝐲(ω))|1p(𝔐0[.]|𝐲(ω))p(𝔐1[.]|𝐲(ω))|𝑑ζ(Θk)+2λϵ(1λ)+2λ.\displaystyle\leq\int_{\mathcal{N}}p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}(\omega)\right)\left|1-\frac{p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}(\omega)\right)}{p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}(\omega)\right)}\right|d\zeta(\Theta_{k})+2\lambda\leq\epsilon(1-\lambda)+2\lambda.

This completes the proof.

Proof of Lemma 2

Consider two independent random variables ς1\varsigma_{1}, ς2\varsigma_{2} with posterior probability distribution functions p(𝔐0[.]|𝐲)p\left(\mathfrak{M}_{0}^{[.]}\big{|}\mathbf{y}\right), p(𝔐1[.]|𝐲)p\left(\mathfrak{M}_{1}^{[.]}\big{|}\mathbf{y}\right) respectively. Then by Piterbarg, (1996),

𝐩(𝔐k[.]|ς1+ς2>𝐲,ς2𝐲/4)\displaystyle\mathbf{p}\left(\mathfrak{M}_{k}^{[.]}\bigg{|}\varsigma_{1}+\varsigma_{2}>\mathbf{y},\varsigma_{2}\leq\mathbf{y}/4\right) 𝐩(𝔐k[.]|ε1>3𝐲/4)=O(𝐲φ0exp(9𝐲2/32))\displaystyle\leq\mathbf{p}\left(\mathfrak{M}_{k}^{[.]}\bigg{|}\varepsilon_{1}>3\mathbf{y}/4\right)=O\left(\mathbf{y}^{\varphi_{0}}\exp(-9\mathbf{y}^{2}/32)\right)
𝐩(𝔐k[.]|ς1+ς2>𝐲,ς23𝐲/4)\displaystyle\mathbf{p}\left(\mathfrak{M}_{k}^{[.]}\bigg{|}\varsigma_{1}+\varsigma_{2}>\mathbf{y},\varsigma_{2}\leq 3\mathbf{y}/4\right) 𝐩(𝔐k[.]|ε2>3𝐲/4)=O(𝐲φ1exp(9𝐲2/32)),\displaystyle\leq\mathbf{p}\left(\mathfrak{M}_{k}^{[.]}\bigg{|}\varepsilon_{2}>3\mathbf{y}/4\right)=O\left(\mathbf{y}^{\varphi_{1}}\exp(-9\mathbf{y}^{2}/32)\right), (12)

where 𝐩={p(𝔐0[.]|.),p(𝔐1[.]|.)}T\mathbf{p}=\left\{p\left(\mathfrak{M}_{0}^{[.]}\bigg{|}.\right),p\left(\mathfrak{M}_{1}^{[.]}\bigg{|}.\right)\right\}^{T} with TT represents the transposition of the matrix.

Now let us analyze the asymptotic properties of the finite integral

I=𝐲/43𝐲/4[1p(𝔐0[.]|𝐲𝐳)]𝑑p(𝔐1[.]|𝐳).I=\int_{\mathbf{y}/4}^{3\mathbf{y}/4}\left[1-p\left(\mathfrak{M}_{0}^{[.]}\bigg{|}\mathbf{y}-\mathbf{z}\right)\right]dp\left(\mathfrak{M}_{1}^{[.]}\bigg{|}\mathbf{z}\right). (13)

There exists m>0m>0 such that m0m\downarrow 0. For a positive integer ϵ\epsilon define mϵ=ϵm/𝐲m_{\epsilon}=\epsilon m/\mathbf{y} so that 𝐲2/(4m)\mathbf{y}^{2}/(4m) is an integer. Therefore,

I\displaystyle I 𝐲/4mϵ3𝐲/4[1p(𝔐0[.]|𝐲mϵ)][p(𝔐1[.]|mϵ)p(𝔐1[.]|mϵ1)]\displaystyle\leq\sum_{\mathbf{y}/4\leq m_{\epsilon}\leq 3\mathbf{y}/4}\left[1-p\left(\mathfrak{M}_{0}^{[.]}\bigg{|}\mathbf{y}-m_{\epsilon}\right)\right]\left[p\left(\mathfrak{M}_{1}^{[.]}\bigg{|}m_{\epsilon}\right)-p\left(\mathfrak{M}_{1}^{[.]}\bigg{|}m_{\epsilon-1}\right)\right]
I\displaystyle I 𝐲/4mϵ3𝐲/4[1p(𝔐0[.]|𝐲mϵ1)][p(𝔐1[.]|mϵ)p(𝔐1[.]|mϵ1)].\displaystyle\geq\sum_{\mathbf{y}/4\leq m_{\epsilon}\leq 3\mathbf{y}/4}\left[1-p\left(\mathfrak{M}_{0}^{[.]}\bigg{|}\mathbf{y}-m_{\epsilon-1}\right)\right]\left[p\left(\mathfrak{M}_{1}^{[.]}\bigg{|}m_{\epsilon}\right)-p\left(\mathfrak{M}_{1}^{[.]}\bigg{|}m_{\epsilon-1}\right)\right]. (14)

By condition

1p(𝔐k[.]|𝐲)=Ψk𝐲φkexp(𝐲2/2)(1+o(1)),as 𝐲,1-p\left(\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}\right)=\Psi_{k}\ \mathbf{y}^{\varphi_{k}}\exp(-\mathbf{y}^{2}/2)(1+o(1)),\ \text{as $\mathbf{y}\rightarrow\infty$,}

there exist two monotonically decreasing functions η0(𝐲)0\eta_{0}(\mathbf{y})\rightarrow 0, η1(𝐲)0\eta_{1}(\mathbf{y})\rightarrow 0 since 𝐲\mathbf{y}\rightarrow\infty so that for all 𝐲>0\mathbf{y}>0 we have,

Ψk𝐲φkexp(𝐲2/2)(1ηk(𝐲))1p(𝔐k[.]|𝐲)Ψk𝐲φkexp(𝐲2/2)(1+ηk(𝐲)),k=0,1.\Psi_{k}\mathbf{y}^{\varphi_{k}}\exp(\mathbf{y}^{2}/2)(1-\eta_{k}(\mathbf{y}))\leq 1-p\left(\mathfrak{M}_{k}^{[.]}\bigg{|}\mathbf{y}\right)\leq\Psi_{k}\mathbf{y}^{\varphi_{k}}\exp(\mathbf{y}^{2}/2)(1+\eta_{k}(\mathbf{y})),\ k=0,1.

Our main objective is to determine the estimate of the upper and the lower bounds of p(𝔐1[.]|mϵ)p(𝔐1[.]|mϵ1)p\left(\mathfrak{M}_{1}^{[.]}\bigg{|}m_{\epsilon}\right)-p\left(\mathfrak{M}_{1}^{[.]}\bigg{|}m_{\epsilon-1}\right) in condition Proof of Lemma 2. The upper bound is

p\displaystyle p (𝔐1[.]|mϵ)p(𝔐1[.]|mϵ1)\displaystyle\left(\mathfrak{M}_{1}^{[.]}\bigg{|}m_{\epsilon}\right)-p\left(\mathfrak{M}_{1}^{[.]}\bigg{|}m_{\epsilon-1}\right)
Ψ1mϵ1φ1exp(12mϵ12)[1+η1(mϵ1)]Ψ1mϵφ1exp(12mϵ2)[1η1(mϵ1)]\displaystyle\leq\Psi_{1}m_{\epsilon-1}^{\varphi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon-1}^{2}\right)\left[1+\eta_{1}(m_{\epsilon-1})\right]-\Psi_{1}m_{\epsilon}^{\varphi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon}^{2}\right)\left[1-\eta_{1}(m_{\epsilon-1})\right]
Ψ1mϵ1φ1exp(12mϵ12)Ψ1mϵφ1exp(12mϵ2)+2Ψ1η1(𝐲/4)mϵ1φ1exp(12mϵ12)\displaystyle\leq\Psi_{1}m_{\epsilon-1}^{\varphi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon-1}^{2}\right)-\Psi_{1}m_{\epsilon}^{\varphi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon}^{2}\right)+2\Psi_{1}\eta_{1}(\mathbf{y}/4)m_{\epsilon-1}^{\varphi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon-1}^{2}\right)
=Ψ1mϵφ1exp(12mϵ2)[(ϵ1ϵ)exp{(2ϵ1)m22𝐲2}1]+2Ψ1η1(𝐲/4)mϵ1φ1exp(12mϵ12)\displaystyle=\Psi_{1}m_{\epsilon}^{\varphi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon}^{2}\right)\left[\left(\frac{\epsilon-1}{\epsilon}\right)\exp\left\{\frac{(2\epsilon-1)m^{2}}{2\mathbf{y}^{2}}\right\}-1\right]+2\Psi_{1}\eta_{1}(\mathbf{y}/4)m_{\epsilon-1}^{\varphi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon-1}^{2}\right)
Ψ1mϵϕ1exp(12mϵ2)(ϵm2𝐲2)Υ(ϵm2𝐲2)+2Ψ1η1(𝐲/4)mϵ1φ1exp(12mϵ12)\displaystyle\leq\Psi_{1}m_{\epsilon}^{\phi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon}^{2}\right)\left(\frac{\epsilon m^{2}}{\mathbf{y}^{2}}\right)\Upsilon\left(\frac{\epsilon m^{2}}{\mathbf{y}^{2}}\right)+2\Psi_{1}\eta_{1}(\mathbf{y}/4)m_{\epsilon-1}^{\varphi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon-1}^{2}\right)
Ψ1Υ(3ϵ/4)mϵϕ1exp(12mϵ2)(ϵm2𝐲2)+2Ψ1η1(𝐲/4)mϵ1φ1exp(12mϵ12),\displaystyle\leq\Psi_{1}\Upsilon(3\epsilon/4)m_{\epsilon}^{\phi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon}^{2}\right)\left(\frac{\epsilon m^{2}}{\mathbf{y}^{2}}\right)+2\Psi_{1}\eta_{1}(\mathbf{y}/4)m_{\epsilon-1}^{\varphi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon-1}^{2}\right),

where Υ(𝐲)=[exp(𝐲)1]/𝐲\Upsilon(\mathbf{y})=[\exp(\mathbf{y})-1]/\mathbf{y}.

The lower bound is,

p\displaystyle p (𝔐1[.]|mϵ)p(𝔐1[.]|mϵ1)\displaystyle\left(\mathfrak{M}_{1}^{[.]}\bigg{|}m_{\epsilon}\right)-p\left(\mathfrak{M}_{1}^{[.]}\bigg{|}m_{\epsilon-1}\right)
(ϵ1ϵ)ϕ1exp(m22𝐲2)Ψ1mϵϕ1exp(12mϵ2)(ϵm2𝐲2)Υ(ϵm2𝐲2)2Ψ1η1(𝐲/4)mϵ1φ1exp(12mϵ12)\displaystyle\geq\left(\frac{\epsilon-1}{\epsilon}\right)^{\phi_{1}}\exp\left(\frac{-m^{2}}{2\mathbf{y}^{2}}\right)\Psi_{1}m_{\epsilon}^{\phi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon}^{2}\right)\left(\frac{\epsilon m^{2}}{\mathbf{y}^{2}}\right)\Upsilon\left(\frac{\epsilon m^{2}}{\mathbf{y}^{2}}\right)-2\Psi_{1}\eta_{1}(\mathbf{y}/4)m_{\epsilon-1}^{\varphi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon-1}^{2}\right)
Υ(m/4)(14mΥ𝐲2)exp(m22𝐲2)Ψ1mϵϕ1exp(12mϵ2)(ϵm2𝐲2)2Ψ1η1(𝐲/4)mϵ1φ1exp(12mϵ12).\displaystyle\geq\Upsilon(m/4)\left(1-\frac{4m}{\Upsilon\mathbf{y}^{2}}\right)\exp\left(\frac{-m^{2}}{2\mathbf{y}^{2}}\right)\Psi_{1}m_{\epsilon}^{\phi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon}^{2}\right)\left(\frac{\epsilon m^{2}}{\mathbf{y}^{2}}\right)-2\Psi_{1}\eta_{1}(\mathbf{y}/4)m_{\epsilon-1}^{\varphi_{1}}\exp\left(-\frac{1}{2}m_{\epsilon-1}^{2}\right).

Therefore,

I\displaystyle I (1+η0(𝐲/4))Υ(3m/4)Ψ0Ψ1𝐲/4mϵ3𝐲/4(𝐲mϵ)ϕ0exp((𝐲mϵ)22)mϵϕ1exp(mϵ122)(ϵm2𝐲2)\displaystyle\leq\left(1+\eta_{0}(\mathbf{y}/4)\right)\Upsilon(3m/4)\Psi_{0}\Psi_{1}\sum_{\mathbf{y}/4\leq m_{\epsilon}\leq 3\mathbf{y}/4}(\mathbf{y}-m_{\epsilon})^{\phi_{0}}\exp\left(-\frac{(\mathbf{y}-m_{\epsilon})^{2}}{2}\right)m_{\epsilon}^{\phi_{1}}\exp\left(-\frac{m_{\epsilon-1}^{2}}{2}\right)\left(\frac{\epsilon m^{2}}{\mathbf{y}^{2}}\right)
+2(1+η0(𝐲/4))η1(𝐲/4)Ψ0Ψ1𝐲/4mϵ3𝐲/4(𝐲mϵ)ϕ0exp((𝐲mϵ)22)mϵϕ1exp(mϵ122),\displaystyle+2\left(1+\eta_{0}(\mathbf{y}/4)\right)\eta_{1}(\mathbf{y}/4)\Psi_{0}\Psi_{1}\sum_{\mathbf{y}/4\leq m_{\epsilon}\leq 3\mathbf{y}/4}(\mathbf{y}-m_{\epsilon})^{\phi_{0}}\exp\left(-\frac{(\mathbf{y}-m_{\epsilon})^{2}}{2}\right)m_{\epsilon}^{\phi_{1}}\exp\left(-\frac{m_{\epsilon-1}^{2}}{2}\right), (15)

and

I\displaystyle I (1+η0(𝐲/4))Υ(m/4)Ψ0Ψ1(14m3𝐲2)ϕ1exp(m2𝐲2)\displaystyle\geq\left(1+\eta_{0}(\mathbf{y}/4)\right)\Upsilon(m/4)\Psi_{0}\Psi_{1}\left(1-\frac{4m}{3\mathbf{y}^{2}}\right)^{\phi_{1}}\exp\left(-\frac{m}{2\mathbf{y}^{2}}\right)
×𝐲/4mϵ3𝐲/4(𝐲mϵ)ϕ0exp((𝐲mϵ)22)mϵϕ1exp(mϵ122)(ϵm2𝐲2)\displaystyle\times\sum_{\mathbf{y}/4\leq m_{\epsilon}\leq 3\mathbf{y}/4}(\mathbf{y}-m_{\epsilon})^{\phi_{0}}\exp\left(-\frac{(\mathbf{y}-m_{\epsilon})^{2}}{2}\right)m_{\epsilon}^{\phi_{1}}\exp\left(-\frac{m_{\epsilon-1}^{2}}{2}\right)\left(\frac{\epsilon m^{2}}{\mathbf{y}^{2}}\right)
2(1+η0(𝐲/4))η1(𝐲/4)Ψ0Ψ1𝐲/4mϵ3𝐲/4(𝐲mϵ)ϕ0exp((𝐲mϵ)22)mϵϕ1exp(mϵ122).\displaystyle-2\left(1+\eta_{0}(\mathbf{y}/4)\right)\eta_{1}(\mathbf{y}/4)\Psi_{0}\Psi_{1}\sum_{\mathbf{y}/4\leq m_{\epsilon}\leq 3\mathbf{y}/4}(\mathbf{y}-m_{\epsilon})^{\phi_{0}}\exp\left(-\frac{(\mathbf{y}-m_{\epsilon})^{2}}{2}\right)m_{\epsilon}^{\phi_{1}}\exp\left(-\frac{m_{\epsilon-1}^{2}}{2}\right). (16)

Define mϵ:=mϵ/𝐲=ϵm/𝐲2m_{\epsilon}^{\prime}:=m_{\epsilon}/\mathbf{y}=\epsilon m/\mathbf{y}^{2}. First sum of the right hand side of Equation (Proof of Lemma 2) yields,

I\displaystyle I^{\prime} =𝐲ϕ0+ϕ1+2exp(𝐲2/2)1/4mϵ3/4(1mϵ)ϕ0mϵϕ1exp(𝐲2(mϵmϵ2)mϵ)(m𝐲2)\displaystyle=\mathbf{y}^{\phi_{0}+\phi_{1}+2}\exp(-\mathbf{y}^{2}/2)\sum_{1/4\leq m_{\epsilon}^{\prime}\leq 3/4}(1-m_{\epsilon}^{\prime})^{\phi_{0}}m_{\epsilon}^{\prime\phi_{1}}\exp\left(\mathbf{y}^{2}(m_{\epsilon}^{\prime}-m_{\epsilon}^{\prime 2})m_{\epsilon}^{\prime}\right)\left(\frac{m}{\mathbf{y}^{2}}\right)
=𝐲ϕ0+ϕ1+2exp(𝐲2/2)1/4mϵ1/4[12(mϵ12)]ϕ0[12+(mϵ12)]ϕ1exp{𝐲24𝐲2(12mϵ)2}mϵm𝐲2\displaystyle=\mathbf{y}^{\phi_{0}+\phi_{1}+2}\exp(-\mathbf{y}^{2}/2)\sum_{-1/4\leq m_{\epsilon}^{\prime}\leq 1/4}\left[\frac{1}{2}-\left(m_{\epsilon}^{\prime}-\frac{1}{2}\right)\right]^{\phi_{0}}\left[\frac{1}{2}+\left(m_{\epsilon}^{\prime}-\frac{1}{2}\right)\right]^{\phi_{1}}\exp\left\{\frac{\mathbf{y}^{2}}{4}-\mathbf{y}^{2}\left(\frac{1}{2}-m_{\epsilon}^{\prime}\right)^{2}\right\}m_{\epsilon}^{\prime}\frac{m}{\mathbf{y}^{2}}
2𝐲ϕ0+ϕ1+2exp(𝐲2/4)0mϵ(1/2)1/4[12(mϵ12)]ϕ0[12+(mϵ12)]ϕ1exp{𝐲2(12mϵ)2}mϵm𝐲2\displaystyle\leq 2\mathbf{y}^{\phi_{0}+\phi_{1}+2}\exp(-\mathbf{y}^{2}/4)\sum_{0\leq m_{\epsilon}^{\prime}-(1/2)\leq 1/4}\left[\frac{1}{2}-\left(m_{\epsilon}^{\prime}-\frac{1}{2}\right)\right]^{\phi_{0}}\left[\frac{1}{2}+\left(m_{\epsilon}^{\prime}-\frac{1}{2}\right)\right]^{\phi_{1}}\exp\left\{-\mathbf{y}^{2}\left(\frac{1}{2}-m_{\epsilon}^{\prime}\right)^{2}\right\}m_{\epsilon}^{\prime}\frac{m}{\mathbf{y}^{2}}
2𝐲ϕ0+ϕ1+2exp(𝐲2/4)1𝐲01/4(12u+m𝐲2)ϕ0(12+u+m𝐲2)ϕ1exp(𝐲2u2)𝑑u\displaystyle\leq 2\mathbf{y}^{\phi_{0}+\phi_{1}+2}\exp(-\mathbf{y}^{2}/4)\frac{1}{\mathbf{y}}\int_{0}^{1/4}\left(\frac{1}{2}-u+\frac{m}{\mathbf{y}^{2}}\right)^{\phi_{0}}\left(\frac{1}{2}+u+\frac{m}{\mathbf{y}^{2}}\right)^{\phi_{1}}\exp(-\mathbf{y}^{2}u^{2})du
=20exp(v2)𝑑v2(ϕ0+ϕ1)𝐲ϕ0+ϕ1+1exp(𝐲2/4)(1+o(1)),\displaystyle=2\int_{0}^{\infty}\exp(-v^{2})dv2^{-(\phi_{0}+\phi_{1})}\mathbf{y}^{\phi_{0}+\phi_{1}+1}\exp(-\mathbf{y}^{2}/4)(1+o(1)),

as 𝐲\mathbf{y}\rightarrow\infty. The last inequality is obtained by using the monotone convergence theorem. The estimate from below for the first sum on the right hand side of condition (Proof of Lemma 2) becomes,

I^\displaystyle\hat{I}^{\prime} =𝐲ϕ0+ϕ1+2exp(𝐲2/2)1/4mϵ3/4(1mϵ)ϕ0mϵϕ1exp(𝐲2(mϵmϵ2m/𝐲2))mϵ(m𝐲2)\displaystyle=\mathbf{y}^{\phi_{0}+\phi_{1}+2}\exp(-\mathbf{y}^{2}/2)\sum_{1/4\leq m_{\epsilon}^{\prime}\leq 3/4}(1-m_{\epsilon}^{\prime})^{\phi_{0}}m_{\epsilon}^{\prime\phi_{1}}\exp\left(\mathbf{y}^{2}(m_{\epsilon}^{\prime}-m_{\epsilon}^{\prime 2}-m/\mathbf{y}^{2})^{\prime}\right)m_{\epsilon}\left(\frac{m}{\mathbf{y}^{2}}\right)
=exp(m)yϕ0+ϕ1+2exp(𝐲2/2)1/4mϵ(1/2)1/4[12(mϵ112)]ϕ0[12+(mϵ12)]ϕ1\displaystyle=\exp(-m)y^{\phi_{0}+\phi_{1}+2}\exp(-\mathbf{y}^{2}/2)\sum_{-1/4\leq m_{\epsilon}^{\prime}-(1/2)\leq 1/4}\left[\frac{1}{2}-\left(m_{\epsilon-1}^{\prime}-\frac{1}{2}\right)\right]^{\phi_{0}}\left[\frac{1}{2}+\left(m_{\epsilon}^{\prime}-\frac{1}{2}\right)\right]^{\phi_{1}}
×exp{𝐲24𝐲2(12mϵ)2}mϵm𝐲2\displaystyle\hskip 85.35826pt\times\exp\left\{\frac{\mathbf{y}^{2}}{4}-\mathbf{y}^{2}\left(\frac{1}{2}-m_{\epsilon}^{\prime}\right)^{2}\right\}m_{\epsilon}^{\prime}\frac{m}{\mathbf{y}^{2}}
2exp(m)yϕ0+ϕ1+2exp(𝐲2/4)0mϵ(1/2)1/4[12(mϵ12)m𝐲2]ϕ0[12+(mϵ12)m𝐲2]ϕ1\displaystyle\geq 2\exp(-m)y^{\phi_{0}+\phi_{1}+2}\exp(-\mathbf{y}^{2}/4)\sum_{0\leq m_{\epsilon}^{\prime}-(1/2)\leq 1/4}\left[\frac{1}{2}-\left(m_{\epsilon}^{\prime}-\frac{1}{2}\right)-\frac{m}{\mathbf{y}^{2}}\right]^{\phi_{0}}\left[\frac{1}{2}+\left(m_{\epsilon}^{\prime}-\frac{1}{2}\right)-\frac{m}{\mathbf{y}^{2}}\right]^{\phi_{1}}
×exp{𝐲2(12mϵ)2}mϵ1m𝐲2\displaystyle\hskip 85.35826pt\times\exp\left\{-\mathbf{y}^{2}\left(\frac{1}{2}-m_{\epsilon}^{\prime}\right)^{2}\right\}m_{\epsilon-1}^{\prime}\frac{m}{\mathbf{y}^{2}}
2exp(m)yϕ0+ϕ1+2exp(𝐲2/4)01/4(12u+m𝐲2)ϕ0(12+u+m𝐲2)ϕ1exp(𝐲2u2)𝑑u\displaystyle\geq 2\exp(-m)y^{\phi_{0}+\phi_{1}+2}\exp(-\mathbf{y}^{2}/4)\int_{0}^{1/4}\left(\frac{1}{2}-u+\frac{m}{\mathbf{y}^{2}}\right)^{\phi_{0}}\left(\frac{1}{2}+u+\frac{m}{\mathbf{y}^{2}}\right)^{\phi_{1}}\exp(-\mathbf{y}^{2}u^{2})du
=20exp(v2)𝑑v2(ϕ0+ϕ1)𝐲ϕ0+ϕ1+1exp(𝐲2/4)(1+o(1)),\displaystyle=2\int_{0}^{\infty}\exp(-v^{2})dv2^{-(\phi_{0}+\phi_{1})}\mathbf{y}^{\phi_{0}+\phi_{1}+1}\exp(-\mathbf{y}^{2}/4)(1+o(1)),

as 𝐲\mathbf{y}\rightarrow\infty. Now consider the second sums I"I" and I^"\hat{I}", on the right hand side of condition (Proof of Lemma 2). For all ϵ\epsilon, ϵ\epsilon-th summand in those sums is determined by the corresponding summand in the first sum by multiplying by 𝐲2/(ϵm2)4/m\mathbf{y}^{2}/(\epsilon m^{2})\leq 4/m. Dividing left hand side and right hand sides of condition (Proof of Lemma 2) and (Proof of Lemma 2) by

V(𝐲)=2(ϕ0+ϕ1)πΨ0Ψ1𝐲ϕ0+ϕ1+1exp(𝐲2/4)V(\mathbf{y})=2^{-(\phi_{0}+\phi_{1})}\sqrt{\pi}\Psi_{0}\Psi_{1}\mathbf{y}^{\phi_{0}+\phi_{1}+1}\exp(-\mathbf{y}^{2}/4)

and letting 𝐲\mathbf{y}\rightarrow\infty yields,

Υ(4m/3)exp(m)lim inf𝐲IV(𝐲)lim sup𝐲IV(𝐲)1.\Upsilon(4m/3)\exp(-m)\leq\liminf_{\mathbf{y}\rightarrow\infty}\frac{I}{V(\mathbf{y})}\leq\limsup_{\mathbf{y}\rightarrow\infty}\frac{I}{V(\mathbf{y})}\leq 1.

For an arbitrary mm, and definition of Υ\Upsilon, the above inequality shows the asymptotic behavior of I. Therefore, the statement of the lemma follows by condition (Proof of Lemma 2). This completes the proof.

Proof of Proposition 1

In order to prove this proposition we would go along the line of Ghosh and Ramamoorthi, (2010). Since τ=n(ΘkΘkn)\tau=\sqrt{n}(\Theta_{k}-\Theta_{k}^{n}), under model 𝔐k[.]\mathfrak{M}_{k}^{[.]} we can write,

p(τ,𝔐k[.]|𝐲)\displaystyle p\left(\tau,\mathfrak{M}_{k}^{[.]}\big{|}\mathbf{y}\right) =f[.](𝐲|Θkn+τn)π(Θk+τn|𝔐k[.])f[.](𝐲|Θkn+un)π(Θk+un|𝔐k[.])𝑑u\displaystyle=\frac{f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{\tau}{\sqrt{n}}\right)\pi\left(\Theta_{k}+\frac{\tau}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)}{\int_{\mathbb{R}}f^{[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{u}{\sqrt{n}}\right)\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)du}
=π(Θk+τn|𝔐k[.])exp{fn[.](𝐲|Θkn+τn,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])}π(Θk+un|𝔐k[.])exp{fn[.](𝐲|Θkn+un,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])}𝑑u.\displaystyle=\frac{\pi\left(\Theta_{k}+\frac{\tau}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{\tau}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)\right\}}{\int_{\mathbb{R}}\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{u}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)\right\}du}.

It is sufficient to show that

|π(Θk+τn|𝔐k[.])exp{fn[.](𝐲|Θkn+τn,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])}π(Θk+un|𝔐k[.])exp{fn[.](𝐲|Θkn+un,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])}𝑑u(Θ¯k)2πexp{12τ2(Θ¯k)}|dτ𝒫Θ¯k0,\int_{\mathbb{R}}\left|\frac{\pi\left(\Theta_{k}+\frac{\tau}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{\tau}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)\right\}}{\int_{\mathbb{R}}\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{u}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)\right\}du}\right.\\ \left.-\sqrt{\frac{\mathcal{I}\left(\bar{\Theta}_{k}\right)}{2\pi}}\exp\left\{-\frac{1}{2}\tau^{2}\mathcal{I}\left(\bar{\Theta}_{k}\right)\right\}\right|d\tau\overset{\mathcal{P}_{\bar{\Theta}_{k}}}{\to}0, (17)

or,

|π(Θk+un|𝔐k[.])exp{fn[.](𝐲|Θkn+un,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])}π(Θ¯k|𝔐k[.])exp{12u2(Θ¯k)}|du𝒫Θ¯k0.\int_{\mathbb{R}}\left|\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{u}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)\right\}\right.\\ \left.-\pi\left(\bar{\Theta}_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{1}{2}u^{2}\mathcal{I}\left(\bar{\Theta}_{k}\right)\right\}\right|du\overset{\mathcal{P}_{\bar{\Theta}_{k}}}{\to}0. (18)

In order to understand conditions 17 and 18 define

Υn:=π(Θk+un|𝔐k[.])exp{fn[.](𝐲|Θkn+un,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])}𝑑u.\Upsilon_{n}:=\int_{\mathbb{R}}\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{u}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)\right\}du.

Thus, expression in condition 17 becomes

1Υn[|π(Θk+τn|𝔐k[.])exp{fn[.](𝐲|Θkn+τn,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])}Υn(Θ¯k)2πexp{12τ2(Θ¯k)}|dτ]𝒫Θ¯k0.\frac{1}{\Upsilon_{n}}\left[\int_{\mathbb{R}}\left|\pi\left(\Theta_{k}+\frac{\tau}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{\tau}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)\right\}\right.\right.\\ \left.\left.-\Upsilon_{n}\sqrt{\frac{\mathcal{I}\left(\bar{\Theta}_{k}\right)}{2\pi}}\exp\left\{-\frac{1}{2}\tau^{2}\mathcal{I}\left(\bar{\Theta}_{k}\right)\right\}\right|d\tau\right]\overset{\mathcal{P}_{\bar{\Theta}_{k}}}{\to}0.

Let us denote two integrals as

I0:=\displaystyle I_{0}:= |π(Θk+τn|𝔐k[.])exp{fn[.](𝐲|Θkn+τn,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])}π(Θ¯k|𝔐k[.])exp{12τ2(Θ¯k)}|dτ\displaystyle\int_{\mathbb{R}}\left|\pi\left(\Theta_{k}+\frac{\tau}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{\tau}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)\right\}-\pi\left(\bar{\Theta}_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{1}{2}\tau^{2}\mathcal{I}\left(\bar{\Theta}_{k}\right)\right\}\right|d\tau
I1\displaystyle I_{1} :=|π(Θ¯k|𝔐k[.])exp{12τ2(Θ¯k)}Υn(Θ¯k)2πexp{12τ2(Θ¯k)}|dτ.\displaystyle:=\int_{\mathbb{R}}\left|\pi\left(\bar{\Theta}_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{1}{2}\tau^{2}\mathcal{I}\left(\bar{\Theta}_{k}\right)\right\}-\Upsilon_{n}\sqrt{\frac{\mathcal{I}\left(\bar{\Theta}_{k}\right)}{2\pi}}\exp\left\{-\frac{1}{2}\tau^{2}\mathcal{I}\left(\bar{\Theta}_{k}\right)\right\}\right|d\tau.

Since condition 18 implies Υnπ(Θ¯k|𝔐k[.])2π/I(Θ¯k)\Upsilon_{n}\rightarrow\pi\left(\bar{\Theta}_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right)\sqrt{2\pi/I(\bar{\Theta}_{k})}, it is sufficient to show that integral inside the parenthesis converges to zero in probability and this term is less than I0+I1I_{0}+I_{1}. Now by condition 18 I00I_{0}\rightarrow 0 and the expression in I1I_{1},

I1=|π(Θ¯k|𝔐k[.])Υn(Θ¯k)2π|exp{12τ2(Θ¯k)}dτ0,I_{1}=\left|\pi\left(\bar{\Theta}_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right)-\Upsilon_{n}\sqrt{\frac{\mathcal{I}\left(\bar{\Theta}_{k}\right)}{2\pi}}\right|\int_{\mathbb{R}}\exp\left\{-\frac{1}{2}\tau^{2}\mathcal{I}\left(\bar{\Theta}_{k}\right)\right\}d\tau\rightarrow 0,

as Υnπ(Θ¯k|𝔐k[.])2π/I(Θ¯k)\Upsilon_{n}\rightarrow\pi\left(\bar{\Theta}_{k}\big{|}\mathfrak{M}_{k}^{[.]}\right)\sqrt{2\pi/I(\bar{\Theta}_{k})}. For further simplicity of this problem, define a function

gn:=1ni=1nf2[.](yi|Θkn,𝔐k[.])=1nf2[.](yi|Θkn,𝔐k[.]).g_{n}:=-\frac{1}{n}\sum_{i=1}^{n}f_{2}^{[.]}\left(y_{i}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)=-\frac{1}{n}f_{2}^{[.]}\left(y_{i}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right).

Clearly gnI(Θk)g_{n}\rightarrow I(\Theta_{k}) almost surely in probability 𝒫Θk\mathcal{P}_{\Theta_{k}} as nn\rightarrow\infty. To check condition 18 it is sufficient to show that

|π(Θk+un|𝔐k[.])exp{fn[.](𝐲|Θkn+un,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])}π(Θkn|𝔐k[.])exp{12u2gn}|du𝒫Θ¯k0.\int_{\mathbb{R}}\left|\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{u}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)\right\}\right.\\ \left.-\pi\left(\Theta_{k}^{n}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}\right|du\overset{\mathcal{P}_{\bar{\Theta}_{k}}}{\to}0. (19)

For any δ,ϰ>0\delta,\varkappa>0 let us break \mathbb{R} into three regions so that 𝒞1={u:|u|<ϰlnn}\mathcal{C}_{1}=\{u:|u|<\varkappa\ln\sqrt{n}\}, 𝒞2={u:ϰlnn<|u|<δn}\mathcal{C}_{2}=\{u:\varkappa\ln\sqrt{n}<|u|<\delta\sqrt{n}\} and 𝒞3={u:|u|>δn}\mathcal{C}_{3}=\{u:|u|>\delta\sqrt{n}\}. For the region 𝒞3\mathcal{C}_{3},

𝒞3|π(Θk+un|𝔐k[.])exp{fn[.](𝐲|Θkn+un,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])}π(Θkn|𝔐k[.])exp{12u2gn}|du𝒞3π(Θk+un|𝔐k[.])exp{fn[.](𝐲|Θkn+un,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])}𝑑u+𝒞3π(Θkn|𝔐k[.])exp{12u2gn}𝑑u.\int_{\mathcal{C}_{3}}\left|\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{u}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)\right\}-\pi\left(\Theta_{k}^{n}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}\right|du\\ \leq\int_{\mathcal{C}_{3}}\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{u}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)\right\}du+\int_{\mathcal{C}_{3}}\pi\left(\Theta_{k}^{n}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}du.

By Assumption (v) in 1 the first integral goes to zero and by the tail estimate of a normal distribution the second integral converges to zero (Ghosh and Ramamoorthi,, 2010). Since, ΘknΘk\Theta_{k}^{n}\rightarrow\Theta_{k} for nn\rightarrow\infty, then a Taylor series expansion yields,

fn[.](𝐲|Θkn+un,𝔐k[.])fn[.](𝐲|Θkn,𝔐k[.])=u22nf2n[.](𝐲|Θkn,𝔐k[.])+16(un)3f3n[.](𝐲|Θk,𝔐k[.])=u2gn2+Rn,f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n}+\frac{u}{\sqrt{n}},\mathfrak{M}_{k}^{[.]}\right)-f^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)=\frac{u^{2}}{2n}f_{2}^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{n},\mathfrak{M}_{k}^{[.]}\right)+\frac{1}{6}\left(\frac{u}{\sqrt{n}}\right)^{3}f_{3}^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{*},\mathfrak{M}_{k}^{[.]}\right)=-\frac{u^{2}g_{n}}{2}+R_{n},

where Θk(Θ¯k,Θkn)\Theta_{k}^{*}\in(\bar{\Theta}_{k},\Theta_{k}^{n}). Now for region 𝒞1\mathcal{C}_{1}

𝒞1|π(Θk+un|𝔐k[.])exp{u2gn2+Rn}π(Θkn|𝔐k[.])exp{12u2gn}|du𝒞1π(Θk+un|𝔐k[.])|exp{u2gn2+Rn}exp{12u2gn}|𝑑u+𝒞1|π(Θk+un|𝔐k[.])π(Θkn|𝔐k[.])|exp{12u2gn}du.\int_{\mathcal{C}_{1}}\left|\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{u^{2}g_{n}}{2}+R_{n}\right\}-\pi\left(\Theta_{k}^{n}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}\right|du\\ \leq\int_{\mathcal{C}_{1}}\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\left|\exp\left\{-\frac{u^{2}g_{n}}{2}+R_{n}\right\}-\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}\right|du\\ +\int_{\mathcal{C}_{1}}\left|\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)-\pi\left(\Theta_{k}^{n}\big{|}\mathfrak{M}_{k}^{[.]}\right)\right|\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}du.

Since the prior density π(.|.)\pi(.|.) is continuous at Θ¯k\bar{\Theta}_{k}, the second integral converges to zero a.s. in probability 𝒫Θ¯k\mathcal{P}_{\bar{\Theta}_{k}}. The first integral of the above expression is,

𝒞1π(Θk+un|𝔐k[.])|exp{Rn}1|exp{12u2gn}𝑑u𝒞1π(Θk+un|𝔐k[.])|Rn|exp|Rn|exp{12u2gn}du.\int_{\mathcal{C}_{1}}\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\left|\exp\{R_{n}\}-1\right|\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}du\leq\int_{\mathcal{C}_{1}}\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)|R_{n}|\exp|R_{n}|\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}du. (20)

Since

supu𝒞1Rn=supu𝒞1(un)3f3n[.](𝐲|Θk,𝔐k[.])ϰ3n(lnn)3OP(1)=oP(1),\sup_{u\in\mathcal{C}_{1}}R_{n}=\sup_{u\in\mathcal{C}_{1}}\left(\frac{u}{\sqrt{n}}\right)^{3}f_{3}^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{*},\mathfrak{M}_{k}^{[.]}\right)\leq\frac{\varkappa^{3}}{n}(\ln\sqrt{n})^{3}O_{P}(1)=o_{P}(1),

the condition 20 satisfies

𝒞1π(Θk+un|𝔐k[.])|Rn|exp|Rn|exp{12u2gn}duπ(Θk+un|𝔐k[.])𝒞1exp{12u2gn}|Rn|exp|Rn|du=oP(1).\int_{\mathcal{C}_{1}}\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)|R_{n}|\exp|R_{n}|\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}du\\ \leq\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\int_{\mathcal{C}_{1}}\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}|R_{n}|\exp|R_{n}|du=o_{P}(1).

Finally, for the region 𝒞2\mathcal{C}_{2},

𝒞2|π(Θk+un|𝔐k[.])exp{u2gn2+Rn}π(Θkn|𝔐k[.])exp{12u2gn}|du𝒞2π(Θk+un|𝔐k[.])exp{u2gn2+Rn}𝑑u+𝒞2π(Θkn|𝔐k[.])exp{12u2gn}𝑑u.\int_{\mathcal{C}_{2}}\left|\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{u^{2}g_{n}}{2}+R_{n}\right\}-\pi\left(\Theta_{k}^{n}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}\right|du\\ \leq\int_{\mathcal{C}_{2}}\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{u^{2}g_{n}}{2}+R_{n}\right\}du+\int_{\mathcal{C}_{2}}\pi\left(\Theta_{k}^{n}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}du.

For a large constant C(0,)C^{*}\in(0,\infty) the second integral of the above inequality satisfies

𝒞2π(Θkn|𝔐k[.])exp{12u2gn}𝑑u2π(Θkn|𝔐k[.])exp{12ϰgnlnn}[δnϰlnn]Cπ(Θkn|𝔐k[.])nnϰgn/4𝒫Θ¯k0.\int_{\mathcal{C}_{2}}\pi\left(\Theta_{k}^{n}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{1}{2}u^{2}g_{n}\right\}du\leq 2\pi\left(\Theta_{k}^{n}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{1}{2}\varkappa g_{n}\ln\sqrt{n}\right\}[\delta\sqrt{n}-\varkappa\ln\sqrt{n}]\\ \leq C^{*}\pi\left(\Theta_{k}^{n}\big{|}\mathfrak{M}_{k}^{[.]}\right)\frac{\sqrt{n}}{n^{\varkappa g_{n}/4}}\overset{\mathcal{P}_{\bar{\Theta}_{k}}}{\to}0.

Since u𝒞2u\in\mathcal{C}_{2} and ϰlnn<|u|<δn\varkappa\ln\sqrt{n}<|u|<\delta\sqrt{n}, first integral yields |u|/n<δ|u|/\sqrt{n}<\delta. Therefore,

|Rn|=16(un)3f3n[.](𝐲|Θk,𝔐k[.])δu26nf3n[.](𝐲|Θk,𝔐k[.]).|R_{n}|=\frac{1}{6}\left(\frac{u}{\sqrt{n}}\right)^{3}f_{3}^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{*},\mathfrak{M}_{k}^{[.]}\right)\leq\frac{\delta u^{2}}{6n}f_{3}^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{*},\mathfrak{M}_{k}^{[.]}\right).

Small values of δ>0\delta>0 ensures

𝒫Θ¯k{|Rn|<u24gn,u𝒞2}>1ϵ,for n>η,\mathcal{P}_{\bar{\Theta}_{k}}\left\{|R_{n}|<\frac{u^{2}}{4}g_{n},\ \forall u\in\mathcal{C}_{2}\right\}>1-\epsilon,\ \text{for $n>\eta^{*}$}, (21)

as supΘk(Θ¯kδ,Θ¯k+δ)(1/n)|f3n[.](𝐲|Θk,𝔐k[.])|=OP(1)\sup_{\Theta_{k}^{*}\in(\bar{\Theta}_{k}-\delta,\bar{\Theta}_{k}+\delta)}(1/n)\left|f_{3}^{n[.]}\left(\mathbf{y}\big{|}\Theta_{k}^{*},\mathfrak{M}_{k}^{[.]}\right)\right|=O_{P}(1). The condition 21 can be written as,

𝒫Θ¯k{u22gn+Rn<u24gn,u𝒞2}>1ϵ.\mathcal{P}_{\bar{\Theta}_{k}}\left\{-\frac{u^{2}}{2}g_{n}+R_{n}<-\frac{u^{2}}{4}g_{n},\ \forall u\in\mathcal{C}_{2}\right\}>1-\epsilon. (22)

Therefore, with probability greater than 1ϵ1-\epsilon,

𝒞2π(Θk+un|𝔐k[.])exp{u2gn2+Rn}𝑑usupΘk𝒞2π(Θk+un|𝔐k[.])𝒞2exp{u2gn2+Rn}𝑑u0,\int_{\mathcal{C}_{2}}\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\exp\left\{-\frac{u^{2}g_{n}}{2}+R_{n}\right\}du\leq\sup_{\Theta_{k}\in\mathcal{C}_{2}}\pi\left(\Theta_{k}+\frac{u}{\sqrt{n}}\big{|}\mathfrak{M}_{k}^{[.]}\right)\int_{\mathcal{C}_{2}}\exp\left\{-\frac{u^{2}g_{n}}{2}+R_{n}\right\}du\rightarrow 0,

as nn\rightarrow\infty. Now first choosing a δ\delta to ensure condition 21 and then by working with the δ\delta in first and second steps yields the final expression. This completes the proof.

Derivation of Equation 9

Note that the probability mass function (pmf) of the zero-truncated negative binomial random variable YY is

fTNB(y|κ)=Γ(y+γ)y!Γ(γ)(1+κγ)γ(1+γκ)y1(1+κγ)γ,y=1,2,.f_{T}^{NB}(y|\kappa)=\frac{\frac{\Gamma(y+\gamma)}{y!\Gamma(\gamma)}\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\left(1+\frac{\gamma}{\kappa}\right)^{-y}}{1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}},\ \ y=1,2,....

The expected value of YY is

𝔼(Y)=γ2κ[1(1+κγ)γ].\mathbb{E}(Y)=\frac{\gamma^{2}}{\kappa\left[1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\right]}.

Note that

ln[fTNB(y|κ)]=ln[Γ(y+γ)y!Γ(γ)]yln(1+γκ)γln[1(1+κγ)γ].\ln\left[f_{T}^{NB}(y|\kappa)\right]=\ln\left[\frac{\Gamma(y+\gamma)}{y!\Gamma(\gamma)}\right]-y\ln\left(1+\frac{\gamma}{\kappa}\right)-\gamma\ln\left[1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\right].

Since

κfTNB(y|κ)=γyκ2(1+γκ)11+κγ(1+κγ)(1+γ)1(1+κγ)γ,\frac{\partial}{\partial\kappa}f_{T}^{NB}(y|\kappa)=\frac{\gamma y}{\kappa^{2}\left(1+\frac{\gamma}{\kappa}\right)}-\frac{1}{1+\frac{\kappa}{\gamma}}-\frac{\left(1+\frac{\kappa}{\gamma}\right)^{-(1+\gamma)}}{1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}},

and

2κ2fTNB(y|κ)\displaystyle\frac{\partial^{2}}{\partial\kappa^{2}}f_{T}^{NB}(y|\kappa) =γ2yκ4(1+γκ)2+1γ(1+κγ)22γyκ3(1+γκ)(1+1γ)(1+κγ)(2+γ)1(1+κγ)γ(1+κγ)(2+γ)[1(1+κγ)γ]2,\displaystyle=\frac{\gamma^{2}y}{\kappa^{4}\left(1+\frac{\gamma}{\kappa}\right)^{2}}+\frac{1}{\gamma\left(1+\frac{\kappa}{\gamma}\right)^{2}}-\frac{2\gamma y}{\kappa^{3}\left(1+\frac{\gamma}{\kappa}\right)}-\frac{\left(1+\frac{1}{\gamma}\right)\left(1+\frac{\kappa}{\gamma}\right)^{-(2+\gamma)}}{1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}}-\frac{\left(1+\frac{\kappa}{\gamma}\right)^{-(2+\gamma)}}{\left[1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\right]^{2}},

we have the Fisher information matrix as

I(κ)\displaystyle I(\kappa) =𝔼[2κ2fTNB(y|κ)]\displaystyle=-\mathbb{E}\left[\frac{\partial^{2}}{\partial\kappa^{2}}f_{T}^{NB}(y|\kappa)\right]
=(γκ)31(γ+κ)[1(1+κγ)γ][21(γ+κ)[1(1+κγ)γ]]γ(γ+κ)2\displaystyle=\left(\frac{\gamma}{\kappa}\right)^{3}\frac{1}{(\gamma+\kappa)\left[1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\right]}\left[2-\frac{1}{(\gamma+\kappa)\left[1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\right]}\right]-\frac{\gamma}{(\gamma+\kappa)^{2}}
+(1+κγ)(2+γ)1(1+κγ)γ[1γ(1+γ)+11(1+κγ)γ]\displaystyle\hskip 56.9055pt+\frac{\left(1+\frac{\kappa}{\gamma}\right)^{-(2+\gamma)}}{1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}}\left[\frac{1}{\gamma}(1+\gamma)+\frac{1}{1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}}\right]
=γκ(γ+κ){(γκ)211(1+κγ)γ[21(γ+κ)[1(1+κγ)γ]]κγ+κ\displaystyle=\frac{\gamma}{\kappa(\gamma+\kappa)}\left\{\left(\frac{\gamma}{\kappa}\right)^{2}\frac{1}{1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}}\left[2-\frac{1}{(\gamma+\kappa)\left[1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}\right]}\right]-\frac{\kappa}{\gamma+\kappa}\right.
+[κ(γ+κ)γ](1+κγ)(2+γ)1(1+κγ)γ[1γ(1+γ)+11(1+κγ)γ]}.\displaystyle\left.\hskip 56.9055pt+\left[\frac{\kappa(\gamma+\kappa)}{\gamma}\right]\frac{\left(1+\frac{\kappa}{\gamma}\right)^{-(2+\gamma)}}{1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}}\left[\frac{1}{\gamma}(1+\gamma)+\frac{1}{1-\left(1+\frac{\kappa}{\gamma}\right)^{-\gamma}}\right]\right\}.

Therefore, the Jeffreys prior is readily available by taking a squared root on I(κ)I(\kappa).

References

  • Agarwal et al., (2002) Agarwal, D. K., Gelfand, A. E., and Citron-Pousty, S. (2002). Zero-inflated models with application to spatial count data. Environmental and Ecological statistics, 9(4):341–355.
  • Akaike, (1998) Akaike, H. (1998). Information theory and an extension of the maximum likelihood principle. In Selected Papers of Hirotugu Akaike, pages 199–213. Springer.
  • Altaweel et al., (2019) Altaweel, A., Stoleru, R., Gu, G., and Maity, A. K. (2019). Collusivehijack: A new route hijacking attack and countermeasures in opportunistic networks. In 2019 IEEE Conference on Communications and Network Security (CNS), pages 73–81. IEEE.
  • Altaweel et al., (2022) Altaweel, A., Stoleru, R., Gu, G., Maity, A. K., and Bhunia, S. (2022). On detecting route hijacking attack in opportunistic mobile networks. IEEE Transactions on Dependable and Secure Computing.
  • Bayarri et al., (2008) Bayarri, M., Berger, J. O., Datta, G. S., et al. (2008). Objective bayes testing of poisson versus inflated poisson models. Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh, 3:105–121.
  • Beck et al., (2023) Beck, J. T. T., McKean, M., Gadgeel, S. M., Bowles, D. W., Haq, R., Yaeger, R., Taylor, M. H., Maity, A. K., Drescher, S., Oliver, C., et al. (2023). A phase 1, open-label, dose escalation and dose expansion study to evaluate the safety, tolerability, pharmacokinetics, and antitumor activity of pf-07799933 (arry-440) as a single agent and in combination therapy in participants 16 years and older with advanced solid tumors with braf alterations.
  • Benson et al., (2010) Benson, A. K., Kelly, S. A., Legge, R., Ma, F., Low, S. J., Kim, J., Zhang, M., Oh, P. L., Nehrenberg, D., Hua, K., et al. (2010). Individuality in gut microbiota composition is a complex polygenic trait shaped by multiple environmental and host genetic factors. Proceedings of the National Academy of Sciences, 107(44):18933–18938.
  • Berger, (2006) Berger, J. (2006). The case for objective bayesian analysis. Bayesian analysis, 1(3):385–402.
  • Bhattacharya et al., (2008) Bhattacharya, A., Clarke, B. S., Datta, G. S., et al. (2008). A bayesian test for excess zeros in a zero-inflated power series distribution. IMS collections, Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K Sen, 1:89–104.
  • Böhning et al., (1999) Böhning, D., Dietz, E., Schlattmann, P., Mendonca, L., and Kirchner, U. (1999). The zero-inflated poisson model and the decayed, missing and filled teeth index in dental epidemiology. Journal of the Royal Statistical Society: Series A (Statistics in Society), 162(2):195–209.
  • Bradlow et al., (2002) Bradlow, E. T., Hardie, B. G. S., and Fader, P. S. (2002). Bayesian inference for the negative binomial distribution via polynomial expansions. Journal of Computational and Graphical Statistics, 11(1):189–201.
  • Burrell, (1990) Burrell, Q. L. (1990). Using the gamma-poisson model to predict library circulations. Journal of the American Society for Information Science, 41(3):164–170.
  • Calvo et al., (2023) Calvo, M., Penkov, K., Spira, A. I., Moreno Candilejo, I., Shore, N. D., Zhang, T., Mellado-Gonzalez, B., Alonso Gordoa, T., Paz-Ares Rodriguez, L., Tarantolo, S. R., et al. (2023). A multi-center, open-label, randomized dose expansion study of pf-06821497, a potent and selective inhibitor of enhancer of zeste homolog 2 (ezh2), in patients with metastatic castration-resistant prostate cancer (mcrpc).
  • Chen et al., (2012) Chen, J., Bittinger, K., Charlson, E. S., Hoffmann, C., Lewis, J., Wu, G. D., Collman, R. G., Bushman, F. D., and Li, H. (2012). Associating microbiome composition with environmental covariates using generalized unifrac distances. Bioinformatics, 28(16):2106–2113.
  • Dasgupta et al., (2023) Dasgupta, S., Acharya, S., Khan, M. A., Pramanik, P., Marbut, S. M., Yunus, F., Galeas, J. N., Singh, S., Singh, A. P., and Dasgupta, S. (2023). Frequent loss of cacna1c, a calcium voltage-gated channel subunit is associated with lung adenocarcinoma progression and poor prognosis. Cancer Research, 83(7_Supplement):3318–3318.
  • Deng and Paul, (2000) Deng, D. and Paul, S. R. (2000). Score tests for zero inflation in generalized linear models. The Canadian Journal of Statistics/La Revue Canadienne de Statistique, pages 563–570.
  • Deng and Paul, (2005) Deng, D. and Paul, S. R. (2005). Score tests for zero-inflation and over-dispersion in generalized linear models. Statistica Sinica, pages 257–276.
  • Dong et al., (2014) Dong, C., Clarke, D. B., Yan, X., Khattak, A., and Huang, B. (2014). Multivariate random-parameters zero-inflated negative binomial regression model: An application to estimate crash frequencies at intersections. Accident Analysis & Prevention, 70:320–329.
  • Efron, (1986) Efron, B. (1986). Double exponential families and their use in generalized linear regression. Journal of the American Statistical Association, 81(395):709–721.
  • Frankel et al., (2017) Frankel, A. E., Coughlin, L. A., Kim, J., Froehlich, T. W., Xie, Y., Frenkel, E. P., and Koh, A. Y. (2017). Metagenomic shotgun sequencing and unbiased metabolomic profiling identify specific human gut microbiota and metabolites associated with immune checkpoint therapy efficacy in melanoma patients. Neoplasia, 19(10):848–855.
  • Garay et al., (2011) Garay, A. M., Hashimoto, E. M., Ortega, E. M., and Lachos, V. H. (2011). On estimation and influence diagnostics for zero-inflated negative binomial regression models. Computational Statistics & Data Analysis, 55(3):1304–1318.
  • Ghosh and Ramamoorthi, (2010) Ghosh, J. and Ramamoorthi, R. (2010). Bayesian nonparametrics. Springer Series in Statistics.
  • Ghosh and Samanta, (2002) Ghosh, J. K. and Samanta, T. (2002). Nonsubjective bayes testing—an overview. Journal of statistical planning and inference, 103(1-2):205–223.
  • Ghosh et al., (2023) Ghosh, R. P., Maity, A. K., Pourahmadi, M., and Mallick, B. K. (2023). Adaptive bayesian variable clustering via structural learning of breast cancer data. Genetic Epidemiology, 47(1):95–104.
  • Halfvarson et al., (2017) Halfvarson, J., Brislawn, C. J., Lamendella, R., Vázquez-Baeza, Y., Walters, W. A., Bramer, L. M., D’amato, M., Bonfiglio, F., McDonald, D., Gonzalez, A., et al. (2017). Dynamics of the human gut microbiome in inflammatory bowel disease. Nature microbiology, 2(5):1–7.
  • Hall, (2000) Hall, D. B. (2000). Zero-inflated poisson and binomial regression with random effects: a case study. Biometrics, 56(4):1030–1039.
  • Hertweck et al., (2023) Hertweck, K. L., Vikramdeo, K. S., Galeas, J. N., Marbut, S. M., Pramanik, P., Yunus, F., Singh, S., Singh, A. P., and Dasgupta, S. (2023). Clinicopathological significance of unraveling mitochondrial pathway alterations in non-small-cell lung cancer. The FASEB Journal, 37(7):e23018.
  • Hua et al., (2019) Hua, L., Polansky, A., and Pramanik, P. (2019). Assessing bivariate tail non-exchangeable dependence. Statistics & Probability Letters, 155:108556.
  • Jeffreys, (1961) Jeffreys, H. (1961). The theory of probability (3rd ed.). OUP Oxford.
  • Jiang et al., (2023) Jiang, R., Zhan, X., and Wang, T. (2023). A flexible zero-inflated poisson-gamma model with application to microbiome sequence count data. Journal of the American Statistical Association, 118(542):792–804.
  • Jiang et al., (2021) Jiang, S., Xiao, G., Koh, A. Y., Kim, J., Li, Q., and Zhan, X. (2021). A bayesian zero-inflated negative binomial regression model for the integrative analysis of microbiome data. Biostatistics, 22(3):522–540.
  • Jovel et al., (2016) Jovel, J., Patterson, J., Wang, W., Hotte, N., O’Keefe, S., Mitchel, T., Perry, T., Kao, D., Mason, A. L., Madsen, K. L., et al. (2016). Characterization of the gut microbiome using 16s or shotgun metagenomics. Frontiers in microbiology, 7:459.
  • Kakkat et al., (2023) Kakkat, S., Pramanik, P., Singh, S., Singh, A. P., Sarkar, C., and Chakroborty, D. (2023). Cardiovascular complications in patients with prostate cancer: Potential molecular connections. International Journal of Molecular Sciences, 24(8):6984.
  • Karlsson et al., (2013) Karlsson, F. H., Tremaroli, V., Nookaew, I., Bergström, G., Behre, C. J., Fagerberg, B., Nielsen, J., and Bäckhed, F. (2013). Gut metagenome in european women with normal, impaired and diabetic glucose control. Nature, 498(7452):99–103.
  • Kass and Vaidyanathan, (1992) Kass, R. E. and Vaidyanathan, S. K. (1992). Approximate bayes factors and orthogonal parameters, with application to testing equality of two binomial proportions. Journal of the Royal Statistical Society: Series B (Methodological), 54(1):129–144.
  • Kass and Wasserman, (1996) Kass, R. E. and Wasserman, L. (1996). The selection of prior distributions by formal rules. Journal of the American statistical Association, 91(435):1343–1370.
  • Kelly et al., (2015) Kelly, B. J., Gross, R., Bittinger, K., Sherrill-Mix, S., Lewis, J. D., Collman, R. G., Bushman, F. D., and Li, H. (2015). Power and sample-size estimation for microbiome studies using pairwise distances and permanova. Bioinformatics, 31(15):2461–2468.
  • Khan et al., (2023) Khan, M. A., Acharya, S., Anand, S., Sameeta, F., Pramanik, P., Keel, C., Singh, S., Carter, J. E., Dasgupta, S., and Singh, A. P. (2023). Myb exhibits racially disparate expression, clinicopathologic association, and predictive potential for biochemical recurrence in prostate cancer. Iscience, 26(12).
  • Kinross et al., (2011) Kinross, J. M., Darzi, A. W., and Nicholson, J. K. (2011). Gut microbiome-host interactions in health and disease. Genome medicine, 3:1–12.
  • La Rosa et al., (2014) La Rosa, P. S., Warner, B. B., Zhou, Y., Weinstock, G. M., Sodergren, E., Hall-Moore, C. M., Stevens, H. J., Bennett Jr, W. E., Shaikh, N., Linneman, L. A., et al. (2014). Patterned progression of bacterial populations in the premature infant gut. Proceedings of the National Academy of Sciences, 111(34):12522–12527.
  • La Rosa et al., (2015) La Rosa, P. S., Zhou, Y., Sodergren, E., Weinstock, G., and Shannon, W. D. (2015). Hypothesis testing of metagenomic data. In Metagenomics for microbiology, pages 81–96. Elsevier.
  • Lambert, (1992) Lambert, D. (1992). Zero-inflated poisson regression, with an application to defects in manufacturing. Technometrics, 34(1):1–14.
  • Lawless, (1987) Lawless, J. F. (1987). Negative binomial and mixed poisson regression. The Canadian Journal of Statistics/La Revue Canadienne de Statistique, pages 209–225.
  • Lüdecke et al., (2021) Lüdecke, D., Ben-Shachar, M. S., Patil, I., Waggoner, P., and Makowski, D. (2021). Performance: An R package for assessment, comparison and testing of statistical models. Journal of Open Source Software, 6(60):3139.
  • Leamy et al., (2014) Leamy, L. J., Kelly, S. A., Nietfeldt, J., Legge, R. M., Ma, F., Hua, K., Sinha, R., Peterson, D. A., Walter, J., Benson, A. K., et al. (2014). Host genetics and diet, but not immunoglobulin a expression, converge to shape compositional features of the gut microbiome in an advanced intercross population of mice. Genome biology, 15:1–20.
  • Li et al., (2021) Li, Q., Zhang, M., Xie, Y., and Xiao, G. (2021). Bayesian modeling of spatial molecular profiling data via gaussian process. Bioinformatics, 37(22):4129–4136.
  • Li et al., (2018) Li, Z., Lee, K., Karagas, M. R., Madan, J. C., Hoen, A. G., O’malley, A. J., and Li, H. (2018). Conditional regression based on a multivariate zero-inflated logistic-normal model for microbiome relative abundance data. Statistics in biosciences, 10:587–608.
  • Liang et al., (2014) Liang, X., Li, H., Tian, G., and Li, S. (2014). Dynamic microbe and molecule networks in a mouse model of colitis-associated colorectal cancer. Scientific reports, 4(1):4985.
  • Love et al., (2014) Love, M. I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for rna-seq data with deseq2. Genome biology, 15(12):1–21.
  • Maier et al., (2018) Maier, L., Pruteanu, M., Kuhn, M., Zeller, G., Telzerow, A., Anderson, E. E., Brochado, A. R., Fernandez, K. C., Dose, H., Mori, H., et al. (2018). Extensive impact of non-antibiotic drugs on human gut bacteria. Nature, 555(7698):623–628.
  • Maity, (2022) Maity, A. (2022). sahpm: Variable Selection using Simulated Annealing. R package version 1.0.1.
  • (52) Maity, A., Chakraborty, A., Bhattacharya, A., Carroll, R., Mallick, B. K., and Maity, M. A. (2019a). Package ‘intsurvbin’.
  • (53) Maity, A., Sharma, J., Sarkar, A., More, A. K., Pal, R. K., Nagane, V. P., and Maity, A. (2018a). Salicylic acid mediated multi-pronged strategy to combat bacterial blight disease (xanthomonas axonopodis pv. punicae) in pomegranate. European Journal of Plant Pathology, 150:923–937.
  • Maity, (2016) Maity, A. K. (2016). Bayesian variable selection in linear and non-linear models. Northern Illinois University.
  • Maity and Basu, (2023) Maity, A. K. and Basu, S. (2023). Highest posterior model computation and variable selection via simulated annealing. The New England Journal of Statistics in Data Science, 1(2):200–207.
  • (56) Maity, A. K., Basu, S., and Ghosh, S. (2021a). Bayesian criterion-based variable selection. Journal of the Royal Statistical Society Series C: Applied Statistics, 70(4):835–857.
  • Maity et al., (2020) Maity, A. K., Bhattacharya, A., Mallick, B. K., and Baladandayuthapani, V. (2020). Bayesian data integration and variable selection for pan-cancer survival prediction using protein expression data. Biometrics, 76(1):316–325.
  • (58) Maity, A. K., Carroll, R. J., and Mallick, B. K. (2019b). Integration of survival and binary data for variable selection and prediction: a bayesian approach. Journal of the Royal Statistical Society Series C: Applied Statistics, 68(5):1577–1595.
  • (59) Maity, A. K., Chan Lee, S., K. Mallick, B., Bhattacharjee, S., and K. Biswas, N. (2021b). semmcmc: Bayesian Structural Equation Modeling in Multiple Omics Data Integration. R package version 0.0.6.
  • Maity and Dey, (2018) Maity, A. K. and Dey, J. (2018). Power analysis of collapsed ordered categories with application to cancer data. Calcutta Statistical Association Bulletin, 70(2):87–95.
  • (61) Maity, A. K., Lee, S. C., Hu, L., Bell-pederson, D., Mallick, B. K., and Sarkar, T. R. (2021c). Circadian gene selection for time-to-event phenotype by integrating cnv and rnaseq data. Chemometrics and Intelligent Laboratory Systems, 212:104276.
  • Maity and Paul, (2022) Maity, A. K. and Paul, E. (2022). Jeffreys prior for negative binomial and zero inflated negative binomial distributions. Sankhya A, pages 1–15.
  • Maity and Paul, (2023) Maity, A. K. and Paul, E. (2023). Jeffreys prior for negative binomial and zero inflated negative binomial distributions. Sankhya A, 85(1):999–1013.
  • (64) Maity, A. K., Pradhan, V., and Das, U. (2018b). Bias reduction in logistic regression with missing responses when the missing data mechanism is nonignorable. The American Statistician.
  • Merkle and You, (2020) Merkle, E. and You, D. (2020). nonnest2: Tests of Non-Nested Models. R package version 0.5-5.
  • (66) Minar, S. J. (2018a). Evaluating the effectiveness of the united nations organizations: the limits of theories and need for a new analytical framework. International Journal of Advanced Research, 6(7):457–462.
  • (67) Minar, S. J. (2018b). Grand strategy and foreign policy: How grand strategy can aid bangladesh’s foreign policy rethinking. Journal of Social Studies, 4(1):20–27.
  • Minar, (2019) Minar, S. J. (2019). Tatmadaw’s crackdown on the rohingyas: A swot analysis. Journal of Social Studies, 5(1):1–5.
  • Nam et al., (2022) Nam, S. J., Kim, S., and Ng, H. K. T. (2022). Bayesian and frequentist approaches on estimation and testing for a zero-inflated binomial distribution. Hacettepe Journal of Mathematics and Statistics, 37(3):1–23.
  • O’Hara and Kotze, (2010) O’Hara, R. and Kotze, J. (2010). Do not log-transform count data. Nature Precedings, pages 1–1.
  • Paul et al., (2018) Paul, E., Maity, A. K., and Maiti, R. (2018). Bayesian comparative study on binary time series. Journal of Statistical Computation and Simulation, 88(14):2811–2826.
  • Piterbarg, (1996) Piterbarg, V. I. (1996). Asymptotic methods in the theory of Gaussian processes and fields, volume 148. American Mathematical Soc.
  • Polansky and Pramanik, (2021) Polansky, A. M. and Pramanik, P. (2021). A motif building process for simulating random networks. Computational Statistics & Data Analysis, 162:107263.
  • Pramanik, (2016) Pramanik, P. (2016). Tail non-exchangeability. Northern Illinois University.
  • Pramanik, (2020) Pramanik, P. (2020). Optimization of market stochastic dynamics. In SN Operations Research Forum, volume 1, page 31. Springer.
  • (76) Pramanik, P. (2021a). Consensus as a nash equilibrium of a stochastic differential game. arXiv preprint arXiv:2107.05183.
  • (77) Pramanik, P. (2021b). Effects of water currents on fish migration through a feynman-type path integral approach under 8/3 liouville-like quantum gravity surfaces. Theory in Biosciences, 140(2):205–223.
  • (78) Pramanik, P. (2021c). Optimization of dynamic objective functions using path integrals. PhD thesis, Northern Illinois University.
  • (79) Pramanik, P. (2022a). On lock-down control of a pandemic model. arXiv preprint arXiv:2206.04248.
  • (80) Pramanik, P. (2022b). Stochastic control of a sir model with non-linear incidence rate through euclidean path integral. arXiv preprint arXiv:2209.13733.
  • (81) Pramanik, P. (2023a). Optimal lock-down intensity: A stochastic pandemic control approach of path integral. Computational and Mathematical Biophysics, 11(1):20230110.
  • (82) Pramanik, P. (2023b). Path integral control in infectious disease modeling. arXiv preprint arXiv:2311.02113.
  • (83) Pramanik, P. (2023c). Path integral control of a stochastic multi-risk sir pandemic model. Theory in Biosciences, 142(2):107–142.
  • Pramanik and Polansky, (2020) Pramanik, P. and Polansky, A. M. (2020). Motivation to run in one-day cricket. arXiv preprint arXiv:2001.11099.
  • Pramanik and Polansky, (2021) Pramanik, P. and Polansky, A. M. (2021). Optimal estimation of brownian penalized regression coefficients. arXiv preprint arXiv:2107.02291.
  • (86) Pramanik, P. and Polansky, A. M. (2023a). Optimization of a dynamic profit function using euclidean path integral. SN Business & Economics, 4(1):8.
  • (87) Pramanik, P. and Polansky, A. M. (2023b). Scoring a goal optimally in a soccer game under liouville-like quantum gravity action. In Operations Research Forum, volume 4, page 66. Springer.
  • Qin et al., (2010) Qin, J., Li, R., Raes, J., Arumugam, M., Burgdorf, K. S., Manichanh, C., Nielsen, T., Pons, N., Levenez, F., Yamada, T., et al. (2010). A human gut microbial gene catalogue established by metagenomic sequencing. nature, 464(7285):59–65.
  • Qin et al., (2014) Qin, N., Yang, F., Li, A., Prifti, E., Chen, Y., Shao, L., Guo, J., Le Chatelier, E., Yao, J., Wu, L., et al. (2014). Alterations of the human gut microbiome in liver cirrhosis. Nature, 513(7516):59–64.
  • R Core Team, (2021) R Core Team (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
  • Ridout et al., (2001) Ridout, M., Hinde, J., and Demétrio, C. G. (2001). A score test for testing a zero-inflated poisson regression model against zero-inflated negative binomial alternatives. Biometrics, 57(1):219–223.
  • Robinson et al., (2010) Robinson, M. D., McCarthy, D. J., and Smyth, G. K. (2010). edger: a bioconductor package for differential expression analysis of digital gene expression data. bioinformatics, 26(1):139–140.
  • Romero et al., (2014) Romero, R., Hassan, S. S., Gajer, P., Tarca, A. L., Fadrosh, D. W., Nikita, L., Galuppi, M., Lamont, R. F., Chaemsaithong, P., Miranda, J., et al. (2014). The composition and stability of the vaginal microbiota of normal pregnant women is different from that of non-pregnant women. Microbiome, 2(1):1–19.
  • (94) Roy, M., Das, S., and Protity, A. T. (2023a). Obeseye: Interpretable diet recommender for obesity management using machine learning and explainable ai. arXiv preprint arXiv:2308.02796.
  • (95) Roy, M., Minar, S. J., Dhar, P., and Faruq, A. (2023b). Machine learning applications in healthcare: The state of knowledge and future directions. arXiv preprint arXiv:2307.14067.
  • (96) Roy, M., Protity, A. T., Das, S., and Dhar, P. (2023c). Prevalence and major risk factors of non-communicable diseases: a machine learning based cross-sectional study. EUREKA: Health Sciences, (3):28–45.
  • Roy Sarkar et al., (2019) Roy Sarkar, T., Maity, A. K., Niu, Y., and Mallick, B. K. (2019). Multiple omics data integration to identify long noncoding rna responsible for breast cancer–related mortality. Cancer Informatics, 18:1176935119871933.
  • Schweizer et al., (2022) Schweizer, M., Penkov, K., Tolcher, A., Choudhury, A., Doronin, V., Aljumaily, R., Calvo, E., Frank, R., Hamm, J., Garcia, V. M., et al. (2022). 488p phase i trial of pf-06821497, a potent and selective inhibitor of enhancer of zeste homolog 2 (ezh2), in follicular lymphoma (fl), small cell lung cancer (sclc) and castration-resistant prostate cancer (crpc). Annals of Oncology, 33:S763–S764.
  • Segata et al., (2012) Segata, N., Waldron, L., Ballarini, A., Narasimhan, V., Jousson, O., and Huttenhower, C. (2012). Metagenomic microbial community profiling using unique clade-specific marker genes. Nature methods, 9(8):811–814.
  • Sender et al., (2016) Sender, R., Fuchs, S., and Milo, R. (2016). Revised estimates for the number of human and bacteria cells in the body. PLoS biology, 14(8):e1002533.
  • Sharpton et al., (2017) Sharpton, T., Lyalina, S., Luong, J., Pham, J., Deal, E. M., Armour, C., Gaulke, C., Sanjabi, S., and Pollard, K. S. (2017). Development of inflammatory bowel disease is linked to a longitudinal restructuring of the gut metagenome in mice. Msystems, 2(5):10–1128.
  • Sommerhalder et al., (2023) Sommerhalder, D., Hamilton, E. P., Mukohara, T., Yonemori, K., Mita, M. M., Yamashita, T., Zheng, J., Liu, L., Maity, A. K., Homji Mishra, N., et al. (2023). First-in-human phase 1 dose escalation study of the kat6 inhibitor pf-07248144 in patients with advanced solid tumors.
  • Stewart et al., (2017) Stewart, C. J., Embleton, N. D., Marrs, E. C., Smith, D. P., Fofanova, T., Nelson, A., Skeath, T., Perry, J. D., Petrosino, J. F., Berrington, J. E., et al. (2017). Longitudinal development of the gut microbiome and metabolome in preterm neonates with late onset sepsis and healthy controls. Microbiome, 5(1):1–11.
  • Ursell et al., (2012) Ursell, L. K., Metcalf, J. L., Parfrey, L. W., and Knight, R. (2012). Defining the human microbiome. Nutrition reviews, 70(suppl_1):S38–S44.
  • Van den Broek, (1995) Van den Broek, J. (1995). A score test for zero inflation in a poisson distribution. Biometrics, pages 738–743.
  • Vatanen et al., (2016) Vatanen, T., Kostic, A. D., d’Hennezel, E., Siljander, H., Franzosa, E. A., Yassour, M., Kolde, R., Vlamakis, H., Arthur, T. D., Hämäläinen, A.-M., et al. (2016). Variation in microbiome lps immunogenicity contributes to autoimmunity in humans. Cell, 165(4):842–853.
  • Vikramdeo et al., (2023) Vikramdeo, K. S., Anand, S., Sudan, S. K., Pramanik, P., Singh, S., Godwin, A. K., Singh, A. P., and Dasgupta, S. (2023). Profiling mitochondrial dna mutations in tumors and circulating extracellular vesicles of triple-negative breast cancer patients for potential biomarker development. FASEB BioAdvances, 5(10):412.
  • Vuong, (1989) Vuong, Q. H. (1989). Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica: journal of the Econometric Society, pages 307–333.
  • Wang et al., (2015) Wang, J., Kalyan, S., Steck, N., Turner, L. M., Harr, B., Künzel, S., Vallier, M., Häsler, R., Franke, A., Oberg, H.-H., et al. (2015). Analysis of intestinal microbiota in hybrid house mice reveals evolutionary divergence in a vertebrate hologenome. Nature communications, 6(1):6440.
  • Wasserman, (2000) Wasserman, L. (2000). Bayesian model selection and model averaging. Journal of mathematical psychology, 44(1):92–107.
  • White and Bennetts, (1996) White, G. C. and Bennetts, R. E. (1996). Analysis of frequency count data using the negative binomial distribution. Ecology, 77(8):2549–2557.
  • Wu et al., (2016) Wu, C., Chen, J., Kim, J., and Pan, W. (2016). An adaptive association test for microbiome data. Genome medicine, 8:1–12.
  • Yang et al., (2018) Yang, X., Qian, Y., Xu, S., Song, Y., and Xiao, Q. (2018). Longitudinal analysis of fecal microbiome and pathologic processes in a rotenone induced mice model of parkinson’s disease. Frontiers in aging neuroscience, 9:441.
  • Zhang et al., (2017) Zhang, X., Mallick, H., Tang, Z., Zhang, L., Cui, X., Benson, A. K., and Yi, N. (2017). Negative binomial mixed models for analyzing microbiome count data. BMC bioinformatics, 18:1–10.
  • Zhang and Yi, (2020) Zhang, X. and Yi, N. (2020). Fast zero-inflated negative binomial mixed modeling approach for analyzing longitudinal metagenomics data. Bioinformatics, 36(8):2345–2351.
  • Zhou and Tu, (2000) Zhou, X.-H. and Tu, W. (2000). Confidence intervals for the mean of diagnostic test charge data containing zeros. Biometrics, 56(4):1118–1125.
  • Zhu et al., (2018) Zhu, W., Winter, M. G., Byndloss, M. X., Spiga, L., Duerkop, B. A., Hughes, E. R., Büttner, L., de Lima Romão, E., Behrendt, C. L., Lopez, C. A., et al. (2018). Precision editing of the gut microbiota ameliorates colitis. Nature, 553(7687):208–211.