Asymptotics of numerical integration for two-level mixed models
Abstract
We study mixed models with a single grouping factor,
where inference about unknown parameters
requires optimizing a marginal likelihood defined by an intractable integral.
Low-dimensional numerical integration techniques are regularly used to approximate these integrals,
with inferences about parameters based on the resulting approximate marginal likelihood.
For a generic class of mixed models that satisfy explicit regularity conditions,
we derive the stochastic relative error rate incurred for both the likelihood and maximum likelihood estimator when adaptive numerical integration is used to approximate the marginal likelihood.
We then specialize the analysis to well-specified generalized linear mixed models having exponential family response and multivariate Gaussian random effects,
verifying that the regularity conditions hold, and hence that the convergence rates apply.
We also prove that, for models whose likelihoods satisfy very weak concentration conditions,
the maximum likelihood estimators from non-adaptive numerical integration approximations of the marginal likelihood are not consistent,
further motivating adaptive numerical integration as the preferred tool for inference in mixed models.
Code to reproduce the simulations in this paper is provided at https://github.com/awstringer1/aq-theory-paper-code.
Keywords: adaptive quadrature, approximate inference, generalized linear models
[...] $\mathbb{P}^*_N\left\{ \left| \frac{\tilde{\pi}(y)}{\pi(y)} - 1 \right| > \kappa \right\} > \gamma$.
Proof.
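Observe that $\tilde{\pi}(y)/\pi(y) = \sum_{z \in Q} \omega(z)\, \pi(z \mid y)$, where $\pi(z \mid y)$ is the posterior density of $u$ evaluated at $z$. We can therefore write
[...]
$\mathbb{P}^*_N\{\tilde{\pi}(y)/\pi(y) > \epsilon + 1\} > \gamma$. Set $\kappa = \epsilon > 0$ and note that $\tilde{\pi}(y) > \pi(y)$ eventually to yield the result. Suppose next that $u^* \notin Q$. Then by
[...], $\tilde{\pi}(y)/\pi(y) \overset{p}{\longrightarrow} 0$. Choose $\epsilon, \gamma \in (0,1)$; then there must exist $n \in \mathbb{N}$ such that for every $N > n$,
[...]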
Proof.
Restricting attention to $\theta = \theta^*$ reduces the problem to exactly that considered by
\cite{bilodeau2021stochastic}, and our Assumptions \ref{assn:kderiv}, \ref{assn:hessian}, \ref{assn:limsup}, \ref{assn:consistency} and \ref{assn:prior} reduce to their Assumptions 1--5. In their Remark 5 they show that these assumptions imply that the Bernstein-von Mises theorem holds for $\pi(y, u; \theta^*)$; this in turn implies the conditions of \ref{fact:nonconvergence}. ∎
While Corollary 1 only applies to the single parameter value $\theta^*$ and only states that the error cannot reach zero (as opposed to, say, diverging to $\infty$), it is nonetheless sufficient to rule out inferences based on $\tilde{\pi}(y; \theta)$ for most mixed models used in practice. In most cases the error of the approximation will depend on $\theta$; it is therefore not guaranteed that the approximated integrated likelihood maintains its shape locally around the mode, and consequently confidence intervals constructed using the local curvature or the likelihood drop may be unreliable.
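The failure of non-adaptive rules can be seen numerically in a toy example. The following is a minimal sketch, not taken from the paper, and it assumes a conjugate model chosen only because its posterior is available in closed form: $y_j \mid u \sim N(u, 1)$ for $j = 1, \ldots, n$ and $u \sim N(0, 1)$, so that $u \mid y \sim N(n\bar{y}/(n+1), 1/(n+1))$. The function names and the hard-coded 3-point Gauss-Hermite rule are illustrative choices. The ratio $\tilde{\pi}(y)/\pi(y)$ is computed as $\sum_{z \in Q} \omega(z)\, \pi(z \mid y)$ with $\omega(z)$ equal to the quadrature weight divided by the prior density at $z$.

```python
import math

# Probabilists' 3-point Gauss-Hermite rule for a N(0, 1) weight:
# nodes 0 and +/-sqrt(3), with weights 2/3, 1/6, 1/6 (they sum to 1).
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def nonadaptive_ratio(n, ybar):
    """Fixed-node approximation divided by the exact marginal likelihood.

    Toy model (an assumption of this sketch): y_j | u ~ N(u, 1), u ~ N(0, 1),
    hence u | y ~ N(n * ybar / (n + 1), 1 / (n + 1)).
    """
    post_mean = n * ybar / (n + 1.0)
    post_var = 1.0 / (n + 1.0)
    # omega(z) = w(z) / phi(z): quadrature weight over the prior density at z.
    return sum(
        w * normal_pdf(z, post_mean, post_var) / normal_pdf(z, 0.0, 1.0)
        for z, w in zip(NODES, WEIGHTS)
    )


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # ybar = 1: the posterior concentrates away from every node.
        # ybar = 0: the posterior concentrates on the node at 0.
        print(n, nonadaptive_ratio(n, ybar=1.0), nonadaptive_ratio(n, ybar=0.0))
```

In this toy setting the ratio is close to 1 for $n = 1$, but with $\bar{y} = 1$ it collapses toward 0 as $n$ grows (the posterior mass falls between the fixed nodes), while with $\bar{y} = 0$ it grows roughly like $(2/3)\sqrt{n+1}$ (the posterior piles up on a node). Neither behaviour approaches 1, which mirrors the two cases treated in the proof.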
3.2 Approximation Error for Adaptive Quadrature
Likelihood approximations based on adaptive quadrature do converge. \ref{fact:likelihood} quantifies the rate of convergence for adaptive quadrature approximations to the marginal likelihood in mixed models. This intermediate technical result is required to prove convergence of the approximate maximum likelihood estimator (\ref{fact:consistency}). A similar result is assumed by \cite{aghqmle}, although they do not specify the region of $\Theta$ in which the uniform convergence occurs, and a stronger result about uniform convergence of derivatives of the approximate log-likelihood is required by \cite{approximatelikelihood}. Our proof is self-contained, and makes use of suitably upgraded technical lemmas recently provided by \cite{bilodeau2021stochastic}. A slight loosening of the usual error rates compared to results obtained in \cite{aghqmle} and \cite{approximatelikelihood} is required for uniformity of the approximation error to hold. Given that the uniformity is assumed and not shown in these previous works, it is possible that their rates are too optimistic for the mixed models considered at present.
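To make the contrast with fixed-node rules concrete, the following minimal sketch applies a 3-point adaptive Gauss-Hermite rule to a toy integrand. It is not the paper's setting: the integrand $f(u) = \exp(au - be^{u})$ with $a = n\bar{y}$ and $b = n$ is an assumption chosen because its integral over the real line, $\Gamma(a)/b^{a}$, is known exactly, so the relative error can be checked without a reference integrator. The function names are hypothetical. The rule recentres the nodes at the mode of $f$ and rescales them by the inverse square root of the negative curvature of $\log f$ at the mode.

```python
import math

# Probabilists' 3-point Gauss-Hermite rule for a N(0, 1) weight.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def log_std_normal_pdf(t):
    """Log density of N(0, 1) at t."""
    return -0.5 * t * t - 0.5 * math.log(2.0 * math.pi)


def adaptive_ratio(n, ybar=1.0):
    """Adaptive 3-point approximation divided by the exact integral.

    Toy integrand (an assumption of this sketch): f(u) = exp(a*u - b*exp(u))
    with a = n * ybar and b = n; its exact integral is Gamma(a) / b**a.
    Requires ybar > 0.
    """
    a, b = n * ybar, float(n)

    def log_f(u):
        return a * u - b * math.exp(u)

    mode = math.log(a / b)        # solves d/du log f = a - b * exp(u) = 0
    scale = 1.0 / math.sqrt(a)    # curvature of log f at the mode is -a
    log_exact = math.lgamma(a) - a * math.log(b)
    # Substitute u = mode + scale * t, then apply the fixed rule in t:
    # integral = E_phi[ f(mode + scale * t) * scale / phi(t) ].
    return sum(
        w * math.exp(log_f(mode + scale * t) + math.log(scale)
                     - log_std_normal_pdf(t) - log_exact)
        for t, w in zip(NODES, WEIGHTS)
    )


if __name__ == "__main__":
    for n in (5, 50, 500):
        print(n, abs(adaptive_ratio(n) - 1.0))
```

In this toy example the relative error is a little under 2% at $n = 5$ and decreases as $n$ grows. The example only illustrates convergence for one integrand; it says nothing about the uniformity in $\theta$ that the result above is concerned with.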
We define $\zeta_i = n_i^{-\alpha}$ for $0 < \alpha < 1/4$, and $\zeta_N = (\min_{i=1,\ldots,m} n_i)^{\alpha}$. We let the number of groups $m = n_{\min}^{q}$ for some $q > 0$, so that as $N \to \infty$, $m \to \infty$ as well. The radius $\zeta_N$ will define the shrinking region in the parameter space in which all our statements about uniform convergence hold; the precise rate of shrinkage $\alpha$ is chosen to balance the concentration of the likelihood with the convergence to zero of the integration error. We also define a fixed neighbourhood of arbitrary radius $\delta > 0$, and a point $\theta^* \in \Theta$ around which the likelihood concentrates. This may be intuitively thought of as a ``true'' value of $\theta$, and under weak conditions will be the point that maximizes the expected log-likelihood; we emphasize that at no point do we assume the model is correctly specified in the sense that $\mathbb{P}^*_N$ is recovered by $\pi(\theta; y)$