
Asymptotics of numerical integration for two-level mixed models

Blair Bilodeau (University of Toronto), Alex Stringer (University of Waterloo), Yanbo Tang* (Imperial College London)
Abstract

We study mixed models with a single grouping factor, where inference about unknown parameters requires optimizing a marginal likelihood defined by an intractable integral. Low-dimensional numerical integration techniques are regularly used to approximate these integrals, with inferences about parameters based on the resulting approximate marginal likelihood. For a generic class of mixed models that satisfy explicit regularity conditions, we derive the stochastic relative error rate incurred for both the likelihood and maximum likelihood estimator when adaptive numerical integration is used to approximate the marginal likelihood. We then specialize the analysis to well-specified generalized linear mixed models having exponential family response and multivariate Gaussian random effects, verifying that the regularity conditions hold, and hence that the convergence rates apply. We also prove that, for models with likelihoods satisfying very weak concentration conditions, the maximum likelihood estimators from non-adaptive numerical integration approximations of the marginal likelihood are not consistent, further motivating adaptive numerical integration as the preferred tool for inference in mixed models. Code to reproduce the simulations in this paper is provided at https://github.com/awstringer1/aq-theory-paper-code.

Keywords: adaptive quadrature, approximate inference, generalized linear models
$\left|\frac{\widetilde{\pi}(\bm{y})}{\pi(\bm{y})}-1\right|>\kappa\Bigr\}>\gamma.$
Proof.
Observe that $\widetilde{\pi}(\bm{y})/\pi(\bm{y})=\sum_{\bm{z}\in\mathcal{Q}}\omega(\bm{z})\pi(\bm{z}|\bm{y})$, where $\pi(\bm{z}|\bm{y})$ is the posterior density of $\bm{u}$ evaluated at $\bm{z}$. We can therefore write
\begin{equation}\label{eqn:fracbound}
\underline{\omega}\times\max_{\bm{z}\in\mathcal{Q}}\pi(\bm{z}|\bm{y})\leq\frac{\widetilde{\pi}(\bm{y})}{\pi(\bm{y})}\leq\left|\mathcal{Q}\right|\,\overline{\omega}\times\max_{\bm{z}\in\mathcal{Q}}\pi(\bm{z}|\bm{y}),
\end{equation}
where $\underline{\omega}=\min_{\bm{z}\in\mathcal{Q}}\omega(\bm{z})$ and $\overline{\omega}=\max_{\bm{z}\in\mathcal{Q}}\omega(\bm{z})$.

There are two cases to consider. Suppose first that $\bm{u}_{*}\in\mathcal{Q}$. Then by \cref{eqn:fracbound} and the assumption of the theorem, $\widetilde{\pi}(\bm{y})/\pi(\bm{y})\overset{p}{\longrightarrow}\infty$. We therefore may choose $\varepsilon>0$, $\gamma\in(0,1)$ such that there must exist $n\in\mathbb{N}$ such that for every $N>n$,
\[
P_{N}^{*}\left\{\frac{\widetilde{\pi}(\bm{y})}{\pi(\bm{y})}>\varepsilon+1\right\}>\gamma.
\]
Set $\kappa=\varepsilon>0$ and note that $\widetilde{\pi}(\bm{y})>\pi(\bm{y})$ eventually to yield the result.

Suppose next that $\bm{u}_{*}\notin\mathcal{Q}$. Then by \cref{eqn:fracbound}, $\widetilde{\pi}(\bm{y})/\pi(\bm{y})\overset{p}{\longrightarrow}0$. Choose $\varepsilon,\gamma\in(0,1)$ such that there must exist $n\in\mathbb{N}$ such that for every $N>n$,
\[
\gamma<P_{N}^{*}\left\{\frac{\widetilde{\pi}(\bm{y})}{\pi(\bm{y})}<\varepsilon\right\}=P_{N}^{*}\left\{1-\frac{\widetilde{\pi}(\bm{y})}{\pi(\bm{y})}>1-\varepsilon\right\}=P_{N}^{*}\left\{\left|\frac{\widetilde{\pi}(\bm{y})}{\pi(\bm{y})}-1\right|>1-\varepsilon\right\},
\]
where the last step uses that $\widetilde{\pi}(\bm{y})<\pi(\bm{y})$ eventually. Set $\kappa=1-\varepsilon>0$ to yield the result. ∎

The most obvious way to guarantee the conditions of \cref{fact:nonconvergence} is for the model to satisfy a Bernstein-von Mises theorem \citep[Section 10.2]{vandervaart}. For a very broad class of misspecified models, \citet{misspec} show that a Bernstein-von Mises-type result holds, suggesting that \cref{fact:nonconvergence} is widely applicable and that its conclusions apply to many models used in practice.

Returning focus to the mixed models which are the subject of the present paper, Corollary 1 specializes \cref{fact:nonconvergence} to mixed models of the form given in \cref{eqn:glmmdefinition}, with "true" parameter value $\bm{\theta}_{*}$ (see \cref{sec:uniform-assumptions} for the precise definition).

Corollary 1.
Consider the model given by \cref{eqn:glmmdefinition}. Under \cref{assn:kderiv,assn:hessian,assn:limsup,assn:limsup-out,assn:consistency,assn:prior} in \cref{sec:uniform-assumptions} and for $\bm{\theta}_{*}$ defined therein, there exist $\kappa>0$ and $\gamma\in(0,1)$ such that
\[
\lim_{N\to\infty}P_{N}^{*}\left\{\left|\frac{\widetilde{\pi}(\bm{y};\bm{\theta}_{*})}{\pi(\bm{y};\bm{\theta}_{*})}-1\right|>\kappa\right\}>\gamma.
\]
Proof.
Restricting attention to $\bm{\theta}=\bm{\theta}_{*}$ reduces the problem to exactly that considered by \citet{bilodeau2021stochastic}, and our \cref{assn:kderiv,assn:hessian,assn:limsup,assn:consistency,assn:prior} reduce to their Assumptions 1--5. In their Remark 5 they show that these assumptions imply that the Bernstein-von Mises theorem holds for $\pi(\bm{y},\bm{u};\bm{\theta}_{*})$; this in turn implies the conditions of \cref{fact:nonconvergence}. ∎

While Corollary 1 only applies to the single parameter value $\bm{\theta}_{*}$ and only states that the error cannot reach zero (as opposed to, say, diverging to $\infty$), it is nonetheless sufficient to rule out inferences based on $\widetilde{\pi}(\bm{y};\bm{\theta})$ for most mixed models used in practice. In most cases the error of the approximation will depend on $\bm{\theta}$. It is therefore not guaranteed that the approximated integrated likelihood maintains its shape locally around the mode, and consequently confidence intervals constructed using the local curvature or the likelihood drop may be unreliable.
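The two cases in the proof above can be seen numerically in a toy setting. The following is a minimal sketch, not the paper's simulation code: it assumes a single group with $y_j \mid u \sim N(u, 1)$ and $u \sim N(0, 1)$, so that the posterior of $u$ is $N(n\bar{y}/(n+1), 1/(n+1))$ and the ratio $\widetilde{\pi}(\bm{y})/\pi(\bm{y})$ is available in closed form. The 3-point Gauss-Hermite rule, the function name `nonadaptive_ratio`, and the treatment of $\bar{y}$ as a fixed data summary are all illustrative assumptions.

```python
import math

# 3-point Gauss-Hermite rule for expectations under N(0, 1):
# nodes 0 and +/- sqrt(3), weights 2/3 and 1/6. These nodes stay
# fixed as n grows, which is what "non-adaptive" means here.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def nonadaptive_ratio(n, ybar):
    """Ratio of the fixed-node approximation to the exact marginal likelihood.

    Toy model (an assumption of this sketch): y_j | u ~ N(u, 1), u ~ N(0, 1).
    Since pi(y | z) / pi(y) = pi(z | y) / pi(z), the ratio is a weighted sum
    of posterior-to-prior density ratios at the fixed nodes.
    """
    post_mean = n * ybar / (n + 1.0)
    post_var = 1.0 / (n + 1.0)
    return sum(
        w * normal_pdf(z, post_mean, post_var) / normal_pdf(z, 0.0, 1.0)
        for z, w in zip(NODES, WEIGHTS)
    )


if __name__ == "__main__":
    for n in (1, 10, 100, 1000, 10000):
        on_node = nonadaptive_ratio(n, 0.0)   # posterior concentrates on a node
        off_node = nonadaptive_ratio(n, 0.5)  # posterior concentrates between nodes
        print(n, on_node, off_node)
```

In this sketch the ratio grows without bound when the posterior concentrates at a quadrature node and collapses to zero when it concentrates elsewhere, so in neither case does the relative error vanish.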

3.2 Approximation Error for Adaptive Quadrature

Likelihood approximations based on adaptive quadrature do converge. \cref{fact:likelihood} quantifies the rate of convergence for adaptive quadrature approximations to the marginal likelihood in mixed models. This intermediate technical result is required to prove convergence of the approximate maximum likelihood estimator (\cref{fact:consistency}). A similar result is assumed by \citet{aghqmle}, although they do not specify the region of $\Theta$ in which the uniform convergence occurs, and a stronger result about uniform convergence of derivatives of the approximate log-likelihood is required by \citet{approximatelikelihood}. Our proof is self-contained, and makes use of suitably upgraded technical lemmas recently provided by \citet{bilodeau2021stochastic}. A slight loosening of the usual error rates compared to results obtained in \citet{aghqmle} and \citet{approximatelikelihood} is required for uniformity of the approximation error to hold. Given that the uniformity is assumed and not shown in these previous works, it is possible that their rates are too optimistic for the mixed models considered at present.
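To give a rough numerical feel for the claim that adaptive quadrature approximations converge, the sketch below recentres and rescales a 3-point Gauss-Hermite rule at the mode and curvature of the log-integrand. It is an illustration under stated assumptions, not the paper's method or code: a single group of $n$ Poisson counts with log link and a $N(0,1)$ random intercept, data summarized by the count total, and a fine trapezoid rule standing in for the exact integral. The function names (`log_joint`, `mode_and_curvature`, `adaptive_ratio`) are hypothetical.

```python
import math

# 3-point Gauss-Hermite rule for expectations under N(0, 1).
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def log_joint(u, n, total):
    """log pi(y, u) up to a constant free of u (assumed toy model:
    n Poisson counts with mean exp(u), sum `total`, and u ~ N(0, 1))."""
    return total * u - n * math.exp(u) - 0.5 * u * u


def mode_and_curvature(n, total):
    """Newton iterations for the mode of log_joint and the negative Hessian there."""
    u = math.log(max(total, 1) / n)
    for _ in range(100):
        grad = total - n * math.exp(u) - u
        hess = -n * math.exp(u) - 1.0
        step = grad / hess
        u -= step
        if abs(step) < 1e-13:
            break
    return u, n * math.exp(u) + 1.0


def adaptive_ratio(n, total, grid=4001, half_width=12.0):
    """Adaptive 3-point approximation divided by a trapezoid-rule reference."""
    u_hat, h = mode_and_curvature(n, total)
    scale = 1.0 / math.sqrt(h)
    g_hat = log_joint(u_hat, n, total)

    def f(u):
        # Integrand normalized by its maximum, so values lie in (0, 1].
        return math.exp(log_joint(u, n, total) - g_hat)

    def phi(z):
        return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

    # Change of variables u = u_hat + scale * z, then apply the N(0, 1) rule.
    approx = scale * sum(
        w * f(u_hat + scale * z) / phi(z) for z, w in zip(NODES, WEIGHTS)
    )

    # Reference value: trapezoid rule on u_hat +/- half_width * scale.
    lo = u_hat - half_width * scale
    step = 2.0 * half_width * scale / (grid - 1)
    vals = [f(lo + i * step) for i in range(grid)]
    reference = step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return approx / reference


if __name__ == "__main__":
    for n in (4, 40, 400, 4000):
        total = (3 * n) // 2  # hypothetical data: average count of 1.5
        print(n, abs(adaptive_ratio(n, total) - 1.0))
```

Because the nodes move with the mode and shrink with the curvature, the relative error in this toy example decreases as the group size grows, in contrast with a rule whose nodes are fixed in advance.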

We define $\zeta_{i}=n_{i}^{-\alpha}$ for $0<\alpha<1/4$, and $\zeta_{N}=(\min_{i=1,\ldots,m}n_{i})^{-\alpha}$. We let the number of groups $m=n_{\min}^{q}$ for some $q>0$, so that as $N\to\infty$, $m\to\infty$ as well. The radius $\zeta_{N}$ will define the shrinking region in the parameter space in which all our statements about uniform convergence hold; the precise rate of shrinkage $\alpha$ is chosen to balance the concentration of the likelihood with the convergence to zero of the integration error. We also define a fixed neighbourhood of arbitrary radius $\delta>0$, and a point $\bm{\theta}_{*}\in\Theta$ around which the likelihood concentrates. This may be intuitively thought of as a "true" value of $\bm{\theta}$, and under weak conditions will be the point that maximizes the expected log-likelihood; we emphasize that at no point do we assume the model is correctly specified in the sense that $P_{N}^{*}$ is recovered by $\pi(\bm{\theta};\bm{y})$