This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

On a Multi-Year Microlevel Collective Risk Model

Rosy Oh [email protected] Himchan Jeong [email protected] Jae Youn Ahn [email protected] Emiliano A. Valdez [email protected] Institute of Mathematical Sciences, Ewha Womans University, Seoul, Korea. Department of Statistics, Ewha Womans University, Seoul, Korea. Department of Mathematics, University of Connecticut, Connecticut, United States
Abstract

For a typical insurance portfolio, the claims process for a short period, typically one year, is characterized by observing frequency of claims together with the associated claims severities. The collective risk model describes this portfolio as a random sum of the aggregation of the claim amounts. In the classical framework, for simplicity, the claim frequency and claim severities are assumed to be mutually independent. However, there is a growing interest in relaxing this independence assumption which is more realistic and useful for the practical insurance ratemaking. While the common thread has been capturing the dependence between frequency and aggregate severity within a single period, the work of Oh et al., 2020a provides an interesting extension to the addition of capturing dependence among individual severities. In this paper, we extend these works within a framework where we have a portfolio of microlevel frequencies and severities for multiple years. This allows us to develop a factor copula model framework that captures various types of dependence between claim frequencies and claim severities over multiple years. It is therefore a clear extension of earlier works on one-year dependent frequency-severity models and on random effects model for capturing serial dependence of claims. We focus on the results using a family of elliptical copulas to model the dependence. The paper further describes how to calibrate the proposed model using illustrative claims data arising from a Singapore insurance company. The estimated results provide strong evidence of all forms of dependencies captured by our model.

1 Introduction

According to Klugman et al., (2012), the aggregate loss in the classical collective risk model is defined as S=i=1NYi,S=\sum_{i=1}^{N}Y_{i}, where NN means the number of claim and YiY_{i} denotes ithi^{th} individual claim amounts over a fixed period of time with the following assumptions:

  1. 1.

    Conditional on N=nN=n, the random variables Y1,Y2,,YnY_{1},Y_{2},\ldots,Y_{n} are i.i.d. random variables.

  2. 2.

    Conditional on N=nN=n, the common distribution of the random variables Y1,Y2,,YnY_{1},Y_{2},\ldots,Y_{n} does not depend on nn.

  3. 3.

    The distribution of NN does not depend in any way on the values of YiY_{i}.

These assumptions might be convenient in terms of computational ease, however, such simplifying assumptions often lead to bias issues especially when used for risk classification. In relaxing such assumptions, various models have been proposed in the insurance literature. An interesting method to model the dependence in the collective risk model is the so-called two-part dependent frequency-severity model as suggested by Frees et al., 2014a . In this model, the dependence is incorporated by using frequency as an explanatory variable in the severity component. A similar approach has been used by Frees et al., 2011a in the modeling and prediction of frequency and severity of health care expenditure. Shi et al., (2015) suggested a three-part framework in order to capture the association between frequency and severity components. When generalized linear models (GLMs) are used with the number of claims treated as a covariate in claims severity, Garrido et al., (2016) showed that the pure premium includes a correction term for inducing dependence. When analyzing bonus-malus data, an interesting observation was made by Park et al., (2018) that dependence between claim frequency and severity is driven by the desire to reach a better bonus-malus class.

Applications of copula methods to capture dependence have been recently used in collective risk models. A majority of work in this area focused on modeling the dependence between frequency and average severity with parametric copulas. For example, Czado et al., (2012) used Gaussian copulas to extend traditional compound Poisson-Gamma two-part model and incorporated possible dependence. Krämer et al., (2013) suggested a similar joint copula-based approach and interestingly observed that ignoring dependence causes a severe underestimation of total loss in a portfolio. Frees et al., (2016) extended the copula-based approach to dependent frequency and average severity using claims data with multiple lines of insurance business. While their findings suggested weak association between frequency and average severity, they concluded that there are strong dependencies among the lines of business.

Unlike choosing a suitable family of marginal distributions, it is usually much harder to choose the correct family of copulas when calibrating these dependent models with data. The work of Krämer et al., (2013) investigated test procedures for the selection of a suitable family of copulas in a dependent frequency and average severity model. However, Oh et al., 2020a illustrated that indeed it is even more difficult to choose the appropriate dependence structure between frequency and average severity that includes the classical collective risk model as a special case. In particular, even under the most naive assumption of independence between frequency and individual severities, choosing the correct parametric copula presents some challenges. Inspired by this phenomenon, Oh et al., 2020a and Cossette et al., (2019) discussed the construction of single year collective risk models with microlevel data to provide a suitable dependence structure between the frequency and severity components. In part, the extension in this paper that captures dependence of various types of dependence between claim frequency and claim severity over multiple years is motivated by the work of Oh et al., 2020a .

In insurance industry, it is important to model the longitudinal property of the insurance losses to predict the fair premium in the future based on each policyholder’s historical claims information. However, the existing copula methods in the literature cannot be directly applied in prediction of the premium due to at least one of the following difficulties:

  • 1.

    Limited to the analysis of data over a single period or cross-sectional data,

  • 2.

    The choice of the copula family to provide a suitable dependence structure between claim frequency and average claim severity can be difficult.

Alternatively, the random effect model can be used to model the longitudinal property of the insurance losses. Hernández-Bastida et al., (2009) and Oh et al., 2020b used the shared random effects model to construct the dependence in a collective risk model, where independence between claim frequency and severity conditional on the random effect is assumed and the dependence structure is naturally derived by the shared random effects. Jeong and Valdez, (2020) derived a closed form of credibility premium for compound loss which captures not only the dependence between frequency and severity but also dependence among the multi-year claims of the same policyholder. However, it is known that the overdispersion and serial dependence can be compounded in the random effect model. Such compounded effect of the random effect can possibly result in pseudo or fake dependence structure in the claims, which in turn leads to the poor prediction of the premium (Denuit et al.,, 2007; Murray et al.,, 2013; Lee et al.,, 2020).

In this regard, as a natural extension of shared random effects model and one-year dependent compound risk model, we propose a multi-year framework with microlevel data so that we may incorporate the following dependencies simultaneously:

  • 1.

    dependence between a frequency and a severity within a year,

  • 2.

    dependence between two distinct severities within a year,

  • 3.

    dependence among frequencies across years,

  • 4.

    dependence between a frequency and a severity in different years,

  • 5.

    dependence between two severities in different years.

Specifically, we use a factor copula representation, which can be viewed as a copula model version of the random effect model (Krupskii and Joe,, 2013, 2015), by using 1-year microlevel model as building blocks.

The remainder of this paper is organized as follows. In Section 2, we propose a generalized shared random effects framework for multi-year microlevel collective risk model that incorporates all types of dependencies previously described. We demonstrate that previous methods for dependence modeling can be considered as special cases of our proposed model. In Section 3, we provide a concrete example of our proposed model with elliptical copulas. Because of simplicity, we focus on the family of Gaussian copulas to further explore various correlation structures that satisfy our framework. In Section 4, an empirical analysis with a special case of our proposed model is conducted with a dataset from an automobile insurance company. Concluding remarks are provided in Section 5 with some future directions of research.

2 Construction of the shared random effect parameter model

2.1 A motivating illustration

While copula methods are flexible in modeling the dependence, the “actual” flexibility comes from the proper choice of the parametric copula family. Although one may consider using the nonparametric copula method for the full flexibility in choosing a copula structure, modeling and interpreting dependence based on the non-parametric copula can be difficult as long as the discrete random variables are involved mainly due to the lack of uniquness (Genest and Nešlehová,, 2007). While recent study in Yang et al., (2019) provides the safe copula estimation method for discrete outcomes in a regression context, it is known to suffer from the so-called curse of dimensionality.

Indeed, as shown in Oh et al., 2020a , it is difficult to choose a proper parametric copula family for the frequency and average severity even under the most naive assumption, the case where frequency and individual severities are independent. This subsection summarizes the example in Oh et al., 2020a to explain such difficult and the necessity to use microlevel claims information.

Consider the classical collective risk model where frequency NN and the individual severity YjY_{j}s are assumed to be independent. Further, assume that NN is a positive integer valued random variable with

(N=n)=15,forn=1,2,3,4,5,{\mathbb{P}}\left(N=n\right)=\frac{1}{5},\quad\hbox{for}\quad n=1,2,3,4,5,

and

Y1,,YN|Ni.i.d.Gamma(ξ,ψ).Y_{1},\cdots,Y_{N}\big{|}N\buildrel{\rm i.i.d.}\over{\sim}{\rm Gamma}(\xi,\psi). (1)

Then, (1) implies

M|NGamma(ξ,ψ/N).M\big{|}N\sim{\rm Gamma}(\xi,\psi/N).

Clearly, NN and MM are not independent even though frequency and individual severities are independent. Since NN is discrete, the visualization and interpretation of the corresponding copula density function for (N,M)(N,M) can be difficult. Alternatively, Oh et al., 2020a provides the density function for the jittered version of (N,M)(N,M) as shown in Figure 1 where x-axis and y-axis corresponds to frequency NN and the average severity MM, respectively.

Refer to caption
(a)
Figure 1: Contour plot, in Oh et al., 2020a , of jittered copula density corresponding to (N,M)(N,M) using a kernel density estimation

Let (U1,U2)(U_{1},U_{2}) be a bivariate random vector sampled from the copula of the jittered version of (N,M)(N,M). As shown in Figure 1, the density of the copula tends to be smaller in the middle part of U2U_{2} when U1U_{1} is smaller, wheras the density tends to be larger in the middle part of U2U_{2} when U1U_{1} is larger. Therefore, it is straightforward to see that conditional variance of MM decreases as NN increases in Figure 1, which is quite intuitive since Var(M|N)=ξ2ψ/N{\mathrm{V}ar}\left(M|N\right)=\xi^{2}\psi/N in this case.

This example illustrates that we can see that most existing copulas, including Gaussian and Archimedean copulas, are unable to accommodate the dependence between frequency and average severity properly. This is a motivation for the modeling the dependence based on the microlevel claims information rather than summarized claims information. We refer the readers to Oh et al., 2020a for more details of this example and the detailed construction of the jittered version of (N,M)(N,M).

2.2 Data structure and model specification

For non-life insurance, claims observed are typically a history of frequencies and severities for multiple years. For a policyholder observed for τ\tau years, we have n1,,nτn_{1},\cdots,n_{\tau} which stand for frequency for each year, and corresponding individual severities (𝒚1,,𝒚τ)(\boldsymbol{y}_{1},\cdots,\boldsymbol{y}_{\tau}) where

𝒚t={not defined,nt=0;(yt,1,,yt,nt),nt>0;\boldsymbol{y}_{t}=\begin{cases}\hbox{not defined},&n_{t}=0;\\ (y_{t,1},\cdots,y_{t,{n_{t}}}),&n_{t}>0;\end{cases}

We find it convenient to define the following symbols for the description of data.

Define a random vector of length Nt+1N_{t}+1

𝒁t:={(Nt,Yt,1,,Yt,Nt),Nt1;0,Nt=0,\boldsymbol{Z}_{t}:=\begin{cases}\left(N_{t},Y_{t,1},\cdots,Y_{t,N_{t}}\right),&N_{t}\geq 1;\\ 0,&N_{t}=0,\end{cases}

and the realization of 𝒁t\boldsymbol{Z}_{t} is denoted as

𝒛t:={(nt,𝒚t),nt1;0,nt=0.\boldsymbol{z}_{t}:=\begin{cases}\left(n_{t},\boldsymbol{y}_{t}\right),&n_{t}\geq 1;\\ 0,&n_{t}=0.\end{cases}

Furthermore, multi-year extension of 𝒁t\boldsymbol{Z}_{t} is defined as

𝒁(τ):=(𝒁1,,𝒁τ)\boldsymbol{Z}_{({\tau})}:=\left(\boldsymbol{Z}_{1},\cdots,\boldsymbol{Z}_{\tau}\right)

and the realization of 𝒁(τ)\boldsymbol{Z}_{(\tau)} is denoted as

𝒛(τ):=(𝒛1,,𝒛τ).\boldsymbol{z}_{(\tau)}:=\left(\boldsymbol{z}_{1},\cdots,\boldsymbol{z}_{\tau}\right).

In the subsequent, we describe a shared random effect parameter model for modeling the type of claims data we observe that primarily consist of frequencies and severities for multiple years.

Model 1 (The copula linked shared random effect model).

Consider the following random effect model for 𝐙t\boldsymbol{Z}_{t} where the joint distribution between the observed losses and the shared random effect is presented with copulas.

  1. i.

    Shared random effect RR follows a probability distribution with density π\pi.

  2. ii.

    Conditional on R=rR=r, we have that 𝒁t\boldsymbol{Z}_{t}\, for t=1,t=1,\cdots are independent observations whose distribution function is given by

    Ht(𝒛t|r):=C(θ3,θ4)(Ft(nt|r),Gt,1(yt,1|r),,Gt,nt(yt,nt|r))H_{t}\left(\boldsymbol{z}_{t}|r\right):=C_{(\theta_{3},\theta_{4})}\left(F_{t}(n_{t}|r),G_{t,1}(y_{t,1}|r),\ldots,G_{t,n_{t}}(y_{t,n_{t}}|r)\right) (2)

    where FtF_{t} and Gt,jG_{t,j} means marginal cumulative distribution functions of NtN_{t} and Yt,jY_{t,j}, respectively and gt,jg_{t,j} means joint density function of Yt,jY_{t,j}. As a result, we have the following distribution function of 𝒁(τ)\boldsymbol{Z}_{(\tau)}

    H(𝒛(τ)):=t=1τHt(𝒛t|r)π(r)dr.H(\boldsymbol{z}_{(\tau)}):=\int\prod\limits_{t=1}^{\tau}H_{t}\left(\boldsymbol{z}_{t}|r\right)\pi(r){\rm d}r.
  3. iii.

    The parameters θ3\theta_{3} and θ4\theta_{4} of the copula C(θ3,θ4)C_{(\theta_{3},\theta_{4})} controls the independence between the frequency and severities and independence among individual severities, respectively, within a year so that we have

    ht(𝒛t|r)=ft(nt|r)gt[joint](𝒚t|r) if and only if θ3=0,{h_{t}{(\boldsymbol{z}_{t}|r)}}=f_{t}(n_{t}|r)g_{t}^{\rm[joint]}(\boldsymbol{y}_{t}|r)\ \text{ if and only if }\ \theta_{3}=0,

    where

    gt(𝒚t|r)=j=1Ntgt,j(yt,j|r) if and only if θ4=0,g_{t}{(\boldsymbol{y}_{t}|r)}=\prod\limits_{j=1}^{N_{t}}g_{t,j}{(y_{t,j}|r)}\ \text{ if and only if }\ \theta_{4}=0,

    and gt[joint]g_{t}^{\rm[joint]} means joint density function of 𝒀t\boldsymbol{Y}_{t}.

  4. iv.

    NtRN_{t}\perp R\, for t=1,t=1,\ldots if and only if θ1=0\theta_{1}=0.

  5. v.

    𝒚tR\boldsymbol{y}_{t}\perp R\, for t=1,t=1,\ldots if and only if θ2=0\theta_{2}=0.

Figure 2 illustrates the dependence structure of our proposed model. In this figure we show that shared random effect RR induces the types of dependence that are of interest to us. To illustrate, RR is linked to the number of claims across years, (N1,,Nτ)(N_{1},\ldots,N_{\tau}), through Cθ1C_{\theta_{1}} so that θ1\theta_{1} is a parameter which captures dependence among claim counts between years. Likewise, RR is linked to the individual amounts of claims across years, (𝒀1,,𝒀τ)(\boldsymbol{Y}_{1},\ldots,\boldsymbol{Y}_{\tau}), through Cθ2C_{\theta_{2}} so that θ2\theta_{2} is a parameter which captures dependence among claim amounts within and across the years. Furthermore, Cθ1C_{\theta_{1}} combined with Cθ2C_{\theta_{2}} introduces the dependence between the claim counts and individual severities within and across the years.

While, via the shared random effect RR, the parameters θ1\theta_{1} and θ2\theta_{2} universally capture dependence among the claims across the years, the other parameters θ3\theta_{3} and θ4\theta_{4} specifically capture dependence within the claims of the same year. That is, θ3\theta_{3} is a parameter which incorporates the dependence between the claim count and claim amounts within a year whereas θ4\theta_{4} is a parameter which incorporates the dependence among claim amounts within a year. Similarly, θ3\theta_{3} combined with θ4\theta_{4} affects the dependence between the claim counts and individual severities within the year. As a result, while dependence among the claims in different years are modeled by (θ1,θ2)(\theta_{1},\theta_{2}) only, the dependence among the claims in the same year are modeled by both (θ1,θ2)(\theta_{1},\theta_{2}) and (θ3,θ4)(\theta_{3},\theta_{4}). Note that our framework is distinguished from some existing work on dependence modeling with copulas such as Shi and Yang, (2018) and Lee and Shi, (2019), where average severity in the form of summarized data was used for modeling and implicitly precluded independence among the individual severities within the same year.

The idea of our multi-year microlevel collective risk model is that the observed claim for year tt, ZtZ_{t}, are independent for t=1,,τt=1,\ldots,\tau given the shared random effect R=rR=r described as follows:

h(𝒛(τ)|r)=t=1τht(zt|r)\xLongrightarrowθ3=0t=1τ[ft(nt|r)gt[joint](𝒚t|r)]\xLongrightarrowθ3=θ4=0t=1τ[ft(nt|r){j=1ntgt,j(yt,j|r)}],h{(\boldsymbol{z}_{(\tau)}|r)}=\prod\limits_{t=1}^{\tau}h_{t}({z_{t}|r})\xLongrightarrow{\theta_{3}=0}\prod\limits_{t=1}^{\tau}\left[f_{t}(n_{t}|r)g_{t}^{\rm[joint]}\left(\boldsymbol{y}_{t}|r\right)\right]\xLongrightarrow{\theta_{3}=\theta_{4}=0}\prod\limits_{t=1}^{\tau}\left[f_{t}(n_{t}|r)\left\{\prod_{j=1}^{n_{t}}g_{t,j}\left({y}_{t,j}|r\right)\right\}\right],

and

h(𝒛(τ))\displaystyle h(\boldsymbol{z}_{(\tau)}) =h(𝒛(τ)|r)π(r)dr\xLongrightarrowθ3=0t=1τ[ft(nt|r)gt[joint](𝒚t|r)]π(r)dr\displaystyle=\int h{(\boldsymbol{z}_{(\tau)}|r)}\pi(r){\rm d}{r}\xLongrightarrow{\theta_{3}=0}\int\prod\limits_{t=1}^{\tau}\left[f_{t}(n_{t}|r)g_{t}^{\rm[joint]}\left(\boldsymbol{y}_{t}|r\right)\right]\pi(r){\rm d}{r} (3)
\xLongrightarrowθ3=θ4=0t=1τ[ft(nt|r){j=1ntgt,j(yt,j|r)}]π(r)dr,\displaystyle\xLongrightarrow{\theta_{3}=\theta_{4}=0}\int\prod\limits_{t=1}^{\tau}\left[f_{t}(n_{t}|r)\left\{\prod_{j=1}^{n_{t}}g_{t,j}\left({y}_{t,j}|r\right)\right\}\right]\pi(r){\rm d}{r},

which is straightforward from iii and iv of Model 1.

Refer to caption
Figure 2: Visual representation of the multi-year microlevel shared random effect model

We note that this construction is similar to the model described by Krupskii and Joe, (2013), which develops a factor copula model conditionally on a set of latent variables. In some sense, according to their paper, our approach leads to a one-factor copula model presented in Section 3. The primary difference in our approach is the clear intuitive interpretation of our model to describe the various types of dependence in a dependent collective risk model. The well-definedness of Model 1 will also be discussed in Remark 1 in Section 3.

2.3 Special cases

It is immediate to see that the classical collective risk model of Klugman et al., (2012) is a special case of our proposed model where θ1=θ2=θ3=θ4=0\theta_{1}=\theta_{2}=\theta_{3}=\theta_{4}=0. This is the case when all frequencies and severities are mutually independent. Baumgartner et al., (2015) proposed shared random effects model to capture association between frequency and the average severity, which is just another special case of our proposed model. This is the case when θ3=θ4=0\theta_{3}=\theta_{4}=0. Finally, it is also easy to check that single-year microlevel collective risk model, proposed by Oh et al., 2020a , is another special case of our proposed model. This is when θ1=θ2=0\theta_{1}=\theta_{2}=0. In this regard, our proposed framework is quite comprehensive that allows other dependence models that have appeared in the literature as special cases.

3 Factor copula model based on the elliptical distributions

Copulas generated by elliptical distributions, also called elliptical copulas, have the correlation matrix as the primary parameter describing dependence between the components. The Gaussian and tt copulas belong to the family of elliptical copulas. We refer to Landsman and Valdez, (2003) for other choices of elliptical copulas including the copulas generated from multivariate Cauchy or multivariate logistic distribution. In this section, for simplicity, apparent ease of computations, and steering clear of distractions from the general case, we focus on the case of Gaussian copulas. In B, we illustrate how Gaussian copulas in multi-year microlevel collective risk model can be generalized into the elliptical copulas by providing an example of tt copula among other choices of elliptical copulas. Specifically, we consider Gaussian copulas with a specific covariance matrix to accommodate the dependence structure of multi-year microlevel collective risk model, and show that such Gaussian copula models can be represented as factor copula models. For the use in elliptical copulas including the Gaussian and tt copulas in mind, we begin with describing dependence structure via correlation matrices.

3.1 Dependence structure via correlation matrix

We start with definition of symbols. Denote \mathbb{N}, 0\mathbb{N}_{0}, \mathbb{R}, and +\mathbb{R}^{+} by the set of positive integer, the set of non-negative integer, the set of real number, and the set of positive real number, respectively.

For a n×mn\times m matrix 𝑴\boldsymbol{M}, we denote (i,j)(i,j)-th component of 𝑴\boldsymbol{M} as [𝑴]ij\left[\boldsymbol{M}\right]_{ij}. For a row vector 𝒗\boldsymbol{v} of length nn, we denote the ii-th component of 𝒗\boldsymbol{v} as [𝒗]i\left[\boldsymbol{v}\right]_{i}. For nn\in\mathbb{N}, define 𝟏n{\bf 1}_{n} and 𝑱n×n\boldsymbol{J}_{n\times n} as a column vector of 11 with length nn and a n×nn\times n matrix of ones, respectively. We use 𝑰n\boldsymbol{I}_{n} for nn\in\mathbb{N} to represent the n×nn\times n identity matrix.

Suppose 𝚺1,1\boldsymbol{\Sigma}_{1,1}, 𝚺1,2\boldsymbol{\Sigma}_{1,2}, 𝚺2,1\boldsymbol{\Sigma}_{2,1}, and 𝚺2,2\boldsymbol{\Sigma}_{2,2} are ×\ell\times\ell, ×m\ell\times m, m×m\times\ell, and m×mm\times m matrices, respectively. Define (+m)×(+m)(\ell+m)\times(\ell+m) matrix 𝚺\boldsymbol{\Sigma} as

𝚺=(𝚺1,1𝚺1,2𝚺2,1𝚺2,2)\boldsymbol{\Sigma}=\left(\begin{array}[]{cc}\boldsymbol{\Sigma}_{1,1}&\boldsymbol{\Sigma}_{1,2}\\ \boldsymbol{\Sigma}_{2,1}&\boldsymbol{\Sigma}_{2,2}\\ \end{array}\right)

If 𝚺2,2\boldsymbol{\Sigma}_{2,2} is invertible, the Schur complement of the block 𝚺2,2\boldsymbol{\Sigma}_{2,2} of the matrix 𝚺\boldsymbol{\Sigma} is the ×\ell\times\ell matrix defined by

𝚺//𝚺1,1:=𝚺2,2𝚺2,1(𝚺1,1)1𝚺1,2.\boldsymbol{\Sigma}\mathbin{\!/\mkern-5.0mu/\!}\boldsymbol{\Sigma}_{1,1}:=\boldsymbol{\Sigma}_{2,2}-\boldsymbol{\Sigma}_{2,1}\left(\boldsymbol{\Sigma}_{1,1}\right)^{-1}\boldsymbol{\Sigma}_{1,2}.
Definition 1.

For 𝐧=(n1,,nτ)(0)τ\boldsymbol{n}=(n_{1},\cdots,n_{\tau})\in\left(\mathbb{N}_{0}\right)^{\tau} and 𝛒=(ρ1,,ρ5)5\boldsymbol{\rho}=(\rho_{1},\cdots,\rho_{5})\in\mathbb{R}^{5} define the partitioned matrix

𝚺(𝒏)(𝝆):=(𝚺11(𝝆)𝚺1τ(𝝆)𝚺τ1(𝝆)𝚺ττ(𝝆)).\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})}:=\left(\begin{array}[]{ccc}\boldsymbol{\Sigma}_{11}^{(\boldsymbol{\rho})}&\cdots&\boldsymbol{\Sigma}_{1\tau}^{(\boldsymbol{\rho})}\\ \vdots&\ddots&\vdots\\ \boldsymbol{\Sigma}_{\tau 1}^{(\boldsymbol{\rho})}&\cdots&\boldsymbol{\Sigma}_{\tau\tau}^{(\boldsymbol{\rho})}\\ \end{array}\right). (4)

For i=1,,τi=1,\cdots,\tau, the matrix 𝚺tt(𝛒)\boldsymbol{\Sigma}_{tt}^{(\boldsymbol{\rho})} is a (nt+1)×(nt+1)(n_{t}+1)\times(n_{t}+1) matrix defined as

[𝚺tt(𝝆)]m={1,=m;ρ2,m,min{,m}2;ρ1,elsewhere;\left[\boldsymbol{\Sigma}_{tt}^{(\boldsymbol{\rho})}\right]_{\ell m}=\begin{cases}1,&\ell=m;\\ \rho_{2},&\ell\neq m,\quad\min\{\ell,m\}\geq 2;\\ \rho_{1},&\hbox{elsewhere};\end{cases}

for ,m=1,,nt+1\ell,m=1,\cdots,n_{t}+1. Furthermore, for i,j=1,,ti,j=1,\cdots,t with iji\neq j, the matrix 𝚺ij\boldsymbol{\Sigma}_{ij} is a (ni+1)×(nj+1)(n_{i}+1)\times(n_{j}+1) matrix defined as

[𝚺tj(𝝆)]m={ρ3,=m=1;ρ5,min{,m}2;ρ4,elsewhere;\left[\boldsymbol{\Sigma}_{tj}^{(\boldsymbol{\rho})}\right]_{\ell m}=\begin{cases}\rho_{3},&\ell=m=1;\\ \rho_{5},&\min\{\ell,m\}\geq 2;\\ \rho_{4},&\hbox{elsewhere};\end{cases}

for =1,,nt+1\ell=1,\cdots,n_{t}+1 and m=1,,nj+1m=1,\cdots,n_{j}+1.

Example 1.

Consider the case 𝐧=(2,3)\boldsymbol{n}=(2,3). Then we can write out 𝚺(𝐧)(𝛒)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})} by denoting 𝐧=(2,3)\boldsymbol{n}=(2,3) and 𝛒=(ρ1,,ρ5)5\boldsymbol{\rho}=(\rho_{1},\cdots,\rho_{5})\in\mathbb{R}^{5}. As a result, 𝚺(𝐧)(𝛒)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})} is a 7×77\times 7 defined as

𝚺(𝒏)(𝝆):=(𝚺11(𝝆)𝚺12(𝝆)𝚺21(𝝆)𝚺22(𝝆))\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})}:=\left(\begin{array}[]{cc}\boldsymbol{\Sigma}_{11}^{(\boldsymbol{\rho})}&\boldsymbol{\Sigma}_{12}^{(\boldsymbol{\rho})}\\ \boldsymbol{\Sigma}_{21}^{(\boldsymbol{\rho})}&\boldsymbol{\Sigma}_{22}^{(\boldsymbol{\rho})}\\ \end{array}\right)

where

𝚺11(𝝆)=(1ρ1ρ1ρ11ρ2ρ1ρ21)𝚺12(𝝆)=(ρ3ρ4ρ4ρ4ρ4ρ5ρ5ρ5ρ4ρ5ρ5ρ5)and𝚺22(𝝆)=(1ρ1ρ1ρ1ρ11ρ2ρ2ρ1ρ21ρ2ρ1ρ2ρ21.)\boldsymbol{\Sigma}_{11}^{(\boldsymbol{\rho})}=\left(\begin{array}[]{ccc}1&\rho_{1}&\rho_{1}\\ \rho_{1}&1&\rho_{2}\\ \rho_{1}&\rho_{2}&1\\ \end{array}\right)\quad\boldsymbol{\Sigma}_{12}^{(\boldsymbol{\rho})}=\left(\begin{array}[]{cccc}\rho_{3}&\rho_{4}&\rho_{4}&\rho_{4}\\ \rho_{4}&\rho_{5}&\rho_{5}&\rho_{5}\\ \rho_{4}&\rho_{5}&\rho_{5}&\rho_{5}\\ \end{array}\right)\quad\hbox{and}\quad\boldsymbol{\Sigma}_{22}^{(\boldsymbol{\rho})}=\left(\begin{array}[]{cccc}1&\rho_{1}&\rho_{1}&\rho_{1}\\ \rho_{1}&1&\rho_{2}&\rho_{2}\\ \rho_{1}&\rho_{2}&1&\rho_{2}\\ \rho_{1}&\rho_{2}&\rho_{2}&1\\ \end{array}.\right)

Furthermore, from the above and the following

𝚺21(𝝆)=(𝚺12(𝝆))T,\boldsymbol{\Sigma}_{21}^{(\boldsymbol{\rho})}=\left(\boldsymbol{\Sigma}_{12}^{(\boldsymbol{\rho})}\right)^{\mathrm{T}},

we have

𝚺(𝒏)(𝝆)=(1ρ1ρ1ρ3ρ4ρ4ρ4ρ11ρ2ρ4ρ5ρ5ρ5ρ1ρ21ρ4ρ5ρ5ρ5\hdashlineρ3ρ4ρ41ρ1ρ1ρ1ρ4ρ5ρ5ρ11ρ2ρ2ρ4ρ5ρ5ρ1ρ21ρ2ρ4ρ5ρ5ρ1ρ2ρ11)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})}=\left(\begin{array}[]{ccc:cccc}1&\rho_{1}&\rho_{1}&\rho_{3}&\rho_{4}&\rho_{4}&\rho_{4}\\ \rho_{1}&1&\rho_{2}&\rho_{4}&\rho_{5}&\rho_{5}&\rho_{5}\\ \rho_{1}&\rho_{2}&1&\rho_{4}&\rho_{5}&\rho_{5}&\rho_{5}\\ \hdashline\rho_{3}&\rho_{4}&\rho_{4}&1&\rho_{1}&\rho_{1}&\rho_{1}\\ \rho_{4}&\rho_{5}&\rho_{5}&\rho_{1}&1&\rho_{2}&\rho_{2}\\ \rho_{4}&\rho_{5}&\rho_{5}&\rho_{1}&\rho_{2}&1&\rho_{2}\\ \rho_{4}&\rho_{5}&\rho_{5}&\rho_{1}&\rho_{2}&\rho_{1}&1\\ \end{array}\right)

In the matrix 𝚺(𝒏)(𝝆)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})}, each component will be used for modeling the correlation between frequencies and severities within and across years. For example, the partitioned matrix 𝚺tt(𝝆)\boldsymbol{\Sigma}_{tt}^{(\boldsymbol{\rho})} is a (nt+1)×(nt+1)(n_{t}+1)\times(n_{t}+1) matrix describing the correlation structure of the random vector (Nt,Yt,1,,Yt,nt)(N_{t},Y_{t,1},\cdots,Y_{t,n_{t}}). Specifically, ρ1\rho_{1} in 𝚺tt(𝝆)\boldsymbol{\Sigma}_{tt}^{(\boldsymbol{\rho})} is used for a correlation between a frequency and a severity in the tt-th year, and ρ2\rho_{2} in 𝚺tt(𝝆)\boldsymbol{\Sigma}_{tt}^{(\boldsymbol{\rho})} is used for a correlation among the severities in the tt-th year. On the other hand, the partitioned matrix 𝚺tj(𝝆)\boldsymbol{\Sigma}_{tj}^{(\boldsymbol{\rho})} is a (nt+1)×(nj+1)(n_{t}+1)\times(n_{j}+1) matrix describing the correlation structure between the random vectors (Nt,Yt,1,,Yt,nt)(N_{t},Y_{t,1},\cdots,Y_{t,n_{t}}) and (Nj,Yj,1,,Yj,nj)(N_{j},Y_{j,1},\cdots,Y_{j,n_{j}}). Specifically, ρ3\rho_{3} in [𝚺tj(𝝆)]11\left[\boldsymbol{\Sigma}_{tj}^{(\boldsymbol{\rho})}\right]_{11} is used for a correlation between the frequencies in the different years, and ρ4\rho_{4} in 𝚺tj(𝝆)\boldsymbol{\Sigma}_{tj}^{(\boldsymbol{\rho})} is used for a correlation between a frequency in different years. Finally, ρ5\rho_{5} in 𝚺tj(𝝆)\boldsymbol{\Sigma}_{tj}^{(\boldsymbol{\rho})} is used for a correlation between a frequency and a severity in different years. The following is summarization for the meaning of each correlation:

  • 1.

    ρ1\rho_{1}: correlation between a frequency and a severity within a year;

  • 2.

    ρ2\rho_{2}: correlation among two distinct severities within a year;

  • 3.

    ρ3\rho_{3}: correlation among frequencies across years;

  • 4.

    ρ4\rho_{4}: correlation between a frequency and a severity in different years;

  • 5.

    ρ5\rho_{5}: correlation between two severities in different years.

We finally note that 𝚺tt(𝝆)\boldsymbol{\Sigma}_{tt}^{(\boldsymbol{\rho})} only depends on (ρ1,ρ2)(\rho_{1},\rho_{2}) while 𝚺tj(𝝆)\boldsymbol{\Sigma}_{tj}^{(\boldsymbol{\rho})} for tjt\neq j only depends on (ρ3,ρ4,ρ5)(\rho_{3},\rho_{4},\rho_{5}). Hence, we find that it is convenient to use 𝚺tt(𝝆)\boldsymbol{\Sigma}_{tt}^{(\boldsymbol{\rho}^{*})} with 𝝆=(ρ1,ρ2)\boldsymbol{\rho}^{*}=(\rho_{1},\rho_{2}) to stand for 𝚺tt(𝝆)\boldsymbol{\Sigma}_{tt}^{(\boldsymbol{\rho})}, and similarly 𝚺tj(𝝆)\boldsymbol{\Sigma}_{tj}^{(\boldsymbol{\rho}^{*})} for tjt\neq j with 𝝆=(ρ3,ρ4,ρ5)\boldsymbol{\rho}^{*}=(\rho_{3},\rho_{4},\rho_{5}) to stand for 𝚺tj(𝝆)\boldsymbol{\Sigma}_{tj}^{(\boldsymbol{\rho})} in a clear context.

Definition 2.

For 𝐧=(n1,,nτ)0τ\boldsymbol{n}=(n_{1},\cdots,n_{\tau})\in\mathbb{N}_{0}^{\tau}, 𝛒=(ρ1,,ρ5)(1,1)5\boldsymbol{\rho}=(\rho_{1},\cdots,\rho_{5})\in(-1,1)^{5}, 𝛉=(θ1,θ2)(1,1)2\boldsymbol{\theta}=(\theta_{1},\theta_{2})\in(-1,1)^{2}, define the partitioned matrix as 𝚺(𝐧)(𝛒;𝛉)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho};\boldsymbol{\theta})} as

𝚺(𝒏)(𝝆,𝜽):=(𝑰1𝛀(𝒏)(𝜽)(𝛀(𝒏)(𝜽))T𝚺(𝒏)(𝝆))\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho},\boldsymbol{\theta})}:=\left(\begin{array}[]{cc}\boldsymbol{I}_{1}&\boldsymbol{\Omega}_{(\boldsymbol{n})}^{(\boldsymbol{\theta})}\\ \left(\boldsymbol{\Omega}_{(\boldsymbol{n})}^{(\boldsymbol{\theta})}\right)^{\mathrm{T}}&\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})}\\ \end{array}\right) (5)

where 𝚺(𝐧)(𝛒)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})} is defined in (4) and 𝛀(𝐧)(𝛉)\boldsymbol{\Omega}_{(\boldsymbol{n})}^{(\boldsymbol{\theta})} is a 1×(𝐧¯+τ)1\times(\bar{\boldsymbol{n}}+\tau) matrix which can be expressed based on the following partitioned matrix

𝛀(𝒏)(𝜽):=(𝛀n1(𝜽),,𝛀nτ(𝜽))\boldsymbol{\Omega}_{(\boldsymbol{n})}^{(\boldsymbol{\theta})}:=\left(\boldsymbol{\Omega}_{n_{1}}^{(\boldsymbol{\theta})},\cdots,\boldsymbol{\Omega}_{n_{\tau}}^{(\boldsymbol{\theta})}\right)

with 𝛀nt(𝛉)\boldsymbol{\Omega}_{n_{t}}^{(\boldsymbol{\theta})} being a 1×(nt+1)1\times(n_{t}+1) matrix given by

[𝛀nt(𝜽)]1:={θ1,=1;θ2,otherwise.\left[\boldsymbol{\Omega}_{n_{t}}^{(\boldsymbol{\theta})}\right]_{1\ell}:=\begin{cases}\theta_{1},&\ell=1;\\ \theta_{2},&\hbox{otherwise}.\end{cases}

In Definition 2, we have introduced two parameters θ1\theta_{1} and θ2\theta_{2}. We impose natural dependence for multiples years of observed claims by using the shared random effect RR, which will affect all frequency and severities in any calendar year. In this regard, θ1\theta_{1} will be served as correlation parameter between the random effect RR and a frequency, and θ2\theta_{2} will be served as correlation parameter between a random effect RR and each severity, as described in Figure 1.

Example 2.

Consider the case 𝐧=(2,3)02\boldsymbol{n}=(2,3)\in\mathbb{N}_{0}^{2}, then one can represent 𝚺𝐧(𝛒,𝛉)\boldsymbol{\Sigma}_{\boldsymbol{n}}^{(\boldsymbol{\rho},\boldsymbol{\theta})} as a partitioned matrix as

𝚺(𝒏)(𝝆,𝜽):=(𝑰1𝛀(𝒏)(𝜽)(𝛀(𝒏)(𝜽))T𝚺(𝒏)(𝝆))\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho},\boldsymbol{\theta})}:=\left(\begin{array}[]{cc}\boldsymbol{I}_{1}&\boldsymbol{\Omega}_{(\boldsymbol{n})}^{(\boldsymbol{\theta})}\\ \left(\boldsymbol{\Omega}_{(\boldsymbol{n})}^{(\boldsymbol{\theta})}\right)^{\mathrm{T}}&\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})}\\ \end{array}\right)

where 𝚺𝐧(𝛒)\boldsymbol{\Sigma}_{\boldsymbol{n}}^{(\boldsymbol{\rho})} is in (4), and

𝛀𝒏(𝜽)=(θ1θ2θ2θ1θ2θ2θ2).\boldsymbol{\Omega}_{\boldsymbol{n}}^{(\boldsymbol{\theta})}=\left(\begin{array}[]{ccccccc}\theta_{1}&\theta_{2}&\theta_{2}&\theta_{1}&\theta_{2}&\theta_{2}&\theta_{2}\\ \end{array}\right).

Hence, we have

𝚺(𝒏)(𝝆,𝜽)=(1θ1θ2θ2θ1θ2θ2θ2\hdashlineθ11ρ1ρ1ρ3ρ4ρ4ρ4θ2ρ11ρ2ρ4ρ5ρ5ρ5θ2ρ1ρ21ρ4ρ5ρ5ρ5\hdashlineθ1ρ3ρ4ρ41ρ1ρ1ρ1θ2ρ4ρ5ρ5ρ11ρ2ρ2θ2ρ4ρ5ρ5ρ1ρ21ρ2θ2ρ4ρ5ρ5ρ1ρ2ρ11).\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho},\boldsymbol{\theta})}=\left(\begin{array}[]{c:ccc:cccc}1&\theta_{1}&\theta_{2}&\theta_{2}&\theta_{1}&\theta_{2}&\theta_{2}&\theta_{2}\\ \hdashline\theta_{1}&1&\rho_{1}&\rho_{1}&\rho_{3}&\rho_{4}&\rho_{4}&\rho_{4}\\ \theta_{2}&\rho_{1}&1&\rho_{2}&\rho_{4}&\rho_{5}&\rho_{5}&\rho_{5}\\ \theta_{2}&\rho_{1}&\rho_{2}&1&\rho_{4}&\rho_{5}&\rho_{5}&\rho_{5}\\ \hdashline\theta_{1}&\rho_{3}&\rho_{4}&\rho_{4}&1&\rho_{1}&\rho_{1}&\rho_{1}\\ \theta_{2}&\rho_{4}&\rho_{5}&\rho_{5}&\rho_{1}&1&\rho_{2}&\rho_{2}\\ \theta_{2}&\rho_{4}&\rho_{5}&\rho_{5}&\rho_{1}&\rho_{2}&1&\rho_{2}\\ \theta_{2}&\rho_{4}&\rho_{5}&\rho_{5}&\rho_{1}&\rho_{2}&\rho_{1}&1\\ \end{array}\right).

Now, for 𝒏=(n1,,nτ)(0)τ\boldsymbol{n}=(n_{1},\cdots,n_{\tau})\in\left(\mathbb{N}_{0}\right)^{\tau} and 𝝆=(ρ1,,ρ5)5\boldsymbol{\rho}=(\rho_{1},\cdots,\rho_{5})\in\mathbb{R}^{5}, we consider reparameterization of a matrix 𝚺(𝒏)(𝝆)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})} with

{ρ1=θ1θ2+θ3θ4ρ2=θ22+θ42ρ3=θ12ρ4=θ1θ2ρ5=θ22\begin{cases}\rho_{1}=\theta_{1}\theta_{2}+\theta_{3}\theta_{4}\\ \rho_{2}=\theta_{2}^{2}+\theta_{4}^{2}\\ \rho_{3}=\theta_{1}^{2}\\ \rho_{4}=\theta_{1}\theta_{2}\\ \rho_{5}=\theta_{2}^{2}\end{cases} (6)

for

𝜽=(θ1,,θ4)4.\boldsymbol{\theta}=\left(\theta_{1},\cdots,\theta_{4}\right)\in\mathbb{R}^{4}.

The following theorem provides some results related with reparameterization in (6).

Theorem 1.

For 𝐧=(n1,,nτ)0τ\boldsymbol{n}=(n_{1},\cdots,n_{\tau})\in\mathbb{N}_{0}^{\tau}, 𝛒=(ρ1,,ρ5)(1,1)5\boldsymbol{\rho}=(\rho_{1},\cdots,\rho_{5})\in(-1,1)^{5}, consider the Schur Complement of the block 𝐈1\boldsymbol{I}_{1} of the matrix 𝚺(𝐧)(𝛒,𝛉)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho},\boldsymbol{\theta})} in (5) denoted as 𝐌:=𝚺(𝐧)(𝛒,𝛉)//𝐈1\boldsymbol{M}:=\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho},\boldsymbol{\theta})}\mathbin{\!/\mkern-5.0mu/\!}\boldsymbol{I}_{1}. For convenience, consider the following block matrix representation of 𝐌\boldsymbol{M} as

𝑴=(𝑴11𝑴1τ𝑴τ1𝑴ττ)\boldsymbol{M}=\left(\begin{array}[]{ccc}\boldsymbol{M}_{11}&\cdots&\boldsymbol{M}_{1\tau}\\ \vdots&\ddots&\vdots\\ \boldsymbol{M}_{\tau 1}&\cdots&\boldsymbol{M}_{\tau\tau}\\ \end{array}\right) (7)

where 𝐌ij\boldsymbol{M}_{ij} is a ni×njn_{i}\times n_{j} matrix. Then, we have the following results.

  1. i.

    For any 𝒏(0)τ\boldsymbol{n}\in\left(\mathbb{N}_{0}\right)^{\tau}, 𝑴\boldsymbol{M} is a block diagonal matrix, i.e. 𝑴ij\boldsymbol{M}_{ij} is a ni×njn_{i}\times n_{j} zero matrix whenever iji\neq j, if and only if ρ3\rho_{3}, ρ4\rho_{4}, and ρ5\rho_{5} satisfy

    ρ3=θ12,ρ4=θ1θ2,andρ5=θ22.\rho_{3}=\theta_{1}^{2},\quad\rho_{4}=\theta_{1}\theta_{2},\quad\hbox{and}\quad\rho_{5}=\theta_{2}^{2}. (8)
  2. ii.

    A matrix 𝚺(𝒏)(𝝆,𝜽)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho},\boldsymbol{\theta})} is positive definite and 𝑴\boldsymbol{M} is a block diagonal matrix for any 𝒏(0)τ\boldsymbol{n}\in\left(\mathbb{N}_{0}\right)^{\tau} if and only if 𝝆\boldsymbol{\rho} is represented as in (6) and satisfying

    θ12+θ32<1andθ22+θ42<1.\theta_{1}^{2}+\theta_{3}^{2}<1\quad\hbox{and}\quad\theta_{2}^{2}+\theta_{4}^{2}<1. (9)
  3. iii.

    A matrix 𝚺(𝒏)(𝝆)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})} with the parametrization in (6) is positive definite for any 𝒏(0)τ\boldsymbol{n}\in\left(\mathbb{N}_{0}\right)^{\tau} if 𝜽\boldsymbol{\theta} satisfies (9).

Proof.

For the proof of part i, it suffices to show that if iji\neq j, then

𝑴ij=𝚺ij[𝛀ni(𝜽)]T𝛀nj(𝜽)\displaystyle\boldsymbol{M}_{ij}=\boldsymbol{\Sigma}_{ij}-[\boldsymbol{\Omega}_{n_{i}}^{(\boldsymbol{\theta})}]^{\mathrm{T}}\boldsymbol{\Omega}_{n_{j}}^{(\boldsymbol{\theta})}

by definition of 𝑴\boldsymbol{M} where 𝚺ij\boldsymbol{\Sigma}_{ij} and 𝛀nt(𝜽)\boldsymbol{\Omega}_{n_{t}}^{(\boldsymbol{\theta})} are defined in (5) and (7), respectively and it can be written as follows:

[𝑴ij]m={ρ3θ12,=m=1;ρ5θ22,min{,m}2;ρ4θ1θ2,elsewhere.\left[\boldsymbol{M}_{ij}\right]_{\ell m}=\begin{cases}\rho_{3}-\theta_{1}^{2},&\ell=m=1;\\ \rho_{5}-\theta_{2}^{2},&\min\{\ell,m\}\geq 2;\\ \rho_{4}-\theta_{1}\theta_{2},&\hbox{elsewhere}.\end{cases}

For the proof of part ii, by Schur decomposition, we have 𝚺(𝒏)(𝝆,𝜽)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho},\boldsymbol{\theta})} is positive definite if and only if 𝑴\boldsymbol{M} is positive definite. Since 𝑴\boldsymbol{M} is a block diagonal matrix provided (6) is satisfied, checking the positive definiteness of 𝑴\boldsymbol{M} is equivalent to check whether 𝑴jj\boldsymbol{M}_{jj} is positive definite or not. Hence, a matrix 𝚺(𝒏)(𝝆,𝜽)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho},\boldsymbol{\theta})} is positive definite and 𝑴\boldsymbol{M} is a block diagonal matrix for any 𝒏(0)τ\boldsymbol{n}\in\left(\mathbb{N}_{0}\right)^{\tau} if and only if 𝚺(tt)(𝝆)\boldsymbol{\Sigma}_{(tt)}^{(\boldsymbol{\rho}^{*})} is positive definite for any t0t\in\mathbb{N}_{0} where

ρ1=ρ1θ1θ21θ121θ22andρ2=ρ2θ221θ22,θ1,θ2(1,1).\rho_{1}^{*}=\frac{\rho_{1}-\theta_{1}\theta_{2}}{\sqrt{1-\theta_{1}^{2}}\sqrt{1-\theta_{2}^{2}}}\quad\hbox{and}\quad\rho_{2}^{*}=\frac{\rho_{2}-\theta_{2}^{2}}{1-\theta_{2}^{2}},\quad\theta_{1},\theta_{2}\in(-1,1).

Following Corollary 1 in Oh et al., 2020a , we have positive definite 𝚺(tt)(𝝆)\boldsymbol{\Sigma}_{(tt)}^{(\boldsymbol{\rho}^{*})} for any t0t\in\mathbb{N}_{0} if and only if

(ρ1)2<ρ2<1.\left(\rho_{1}^{*}\right)^{2}<\rho_{2}^{*}<1. (10)

Finally, simple argument shows that (10) with the condition θ1,θ2(1,1)\theta_{1},\theta_{2}\in(-1,1) is equivalent with

ρ1=θ1θ2+θ3θ4andρ2=θ22+θ42\rho_{1}=\theta_{1}\theta_{2}+\theta_{3}\theta_{4}\quad\hbox{and}\quad\rho_{2}=\theta_{2}^{2}+\theta_{4}^{2}

for

θ12+θ32<1andθ22+θ42<1.\theta_{1}^{2}+\theta_{3}^{2}<1\quad\hbox{and}\quad\theta_{2}^{2}+\theta_{4}^{2}<1.

The proof of part iii immediately follows from part i and ii. ∎

3.2 The special case of Gaussian copulas

Let FtF_{t} be non-negative integer-valued distribution functions with the respective probability mass functions ftf_{t} for tt\in\mathbb{N}. Let GtG_{t} and Gt,jG_{t,j} be non-negative real-valued distribution functions with respective probability densities gtg_{t} and gt,jg_{t,j} for t,jt,j\in\mathbb{N}. While it is not necessary but for simplicity, we assume Gtj=GtG_{tj}=G_{t} for any t,jt,j\in\mathbb{N}.

We use Φ\Phi and ϕ\phi to denote the standard normal distribution and the corresponding density function, respectively. For a vector 𝝁n\boldsymbol{\mu}\in\mathbb{R}^{n} and a n×nn\times n covariance matrix 𝚺\boldsymbol{\Sigma}, we use Φ𝝁,𝚺\Phi_{\boldsymbol{\mu},\boldsymbol{\Sigma}} to denote the distribution function of multivariate normal distribution with mean 𝝁\boldsymbol{\mu} and a covariance matrix 𝚺\boldsymbol{\Sigma}, and ϕ𝝁,𝚺\phi_{\boldsymbol{\mu},\boldsymbol{\Sigma}} to denote the corresponding density function. Now, we are ready to present the multi-year microlevel collective risk model where the Gaussian copula is used to model the dependence.

Model 2 (The Gaussian copula model for the multi-year microlevel collective risk model).

Suppose 𝛒\boldsymbol{\rho} satisfies (6). Then, consider the random vector 𝐙t\boldsymbol{Z}_{t} whose joint distribution function H(𝐳τ)H(\boldsymbol{z}_{\tau}) is given by the following copula model representation

H(𝒛(τ))=C(𝒏)(𝝆)(F1(n1),G1,1(y1,1),,G1,n1(y1,n1),Fτ(nτ),Gτ,1(yτ,1),,Gτ,nτ(yτ,nτ))H(\boldsymbol{z}_{(\tau)})=C_{(\boldsymbol{n})}^{(\boldsymbol{\rho})}\left(F_{1}(n_{1}),G_{1,1}(y_{1,1}),\cdots,G_{1,n_{1}}(y_{1,n_{1}}),\cdots F_{\tau}(n_{\tau}),G_{\tau,1}(y_{\tau,1}),\cdots,G_{\tau,n_{\tau}}(y_{\tau,n_{\tau}})\right) (11)

where C(𝐧)(𝛒)C_{(\boldsymbol{n})}^{(\boldsymbol{\rho})} is a Gaussian copula with correlation matrix 𝚺(𝐧)(𝛒)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})}.

From Lemma 1, the matrix 𝚺(𝒏)(𝝆)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})} is positive definite for any 𝝆\boldsymbol{\rho} satisfying (6). Hence, C(𝒏)(𝝆)C_{(\boldsymbol{n})}^{(\boldsymbol{\rho})} in Model 2 is a valid Gaussian copula. One can see that the estimation of the parameters in (11) is involved with the calculation of multivariate Gaussian density function which depends on the length of the observer years. Let 𝒃=(b1,,bτ)\boldsymbol{b}=\left(b_{1},\cdots,b_{\tau}\right) be vertices where each bjb_{j} is equal to either njn_{j} or nj1n_{j}-1. Then the corresponding density function of the random vector of 𝒁(𝝉)\boldsymbol{Z}_{(\boldsymbol{\tau})} at 𝒁(𝝉)=𝒛(𝝉)\boldsymbol{Z}_{(\boldsymbol{\tau})}=\boldsymbol{z}_{(\boldsymbol{\tau})} in (11) is given by

h(𝒛(𝝉))=sgn(𝒃)𝒛¯+τy1,1y1,n1,y2,1y2,n2,,yτ,1,yτ,nτH(𝒛(𝝉))h(\boldsymbol{z}_{(\boldsymbol{\tau})})=\sum{\rm sgn}(\boldsymbol{b})\frac{\partial^{\bar{\boldsymbol{z}}+\tau}}{\partial y_{1,1}\cdots\partial y_{1,n_{1}},\partial y_{2,1}\cdots\partial y_{2,n_{2}},\cdots,\partial y_{\tau,1},\cdots\partial y_{\tau,n_{\tau}}}H\left(\boldsymbol{z}_{(\boldsymbol{\tau})}\right) (12)

where the sum is taken over all vertices 𝒃\boldsymbol{b}, and sgn(𝒃){\rm sgn}(\boldsymbol{b}) is given by

sgn(𝒃)={+1,if bj=nj1 for an even number of j’s;1,if bj=nj1 for an odd number of j’s.{\rm sgn}(\boldsymbol{b})=\begin{cases}+1,&\hbox{if $b_{j}=n_{j}-1$ for an even number of $j$'s;}\\ -1,&\hbox{if $b_{j}=n_{j}-1$ for an odd number of $j$'s.}\end{cases}

Here, we note that calculation of the density function in (12) can be difficult due to the following aspects of our model.

  • 1.

    Due to the discrete nature of the frequency observations, one can immediately check that the computational complexity in (12) grows exponentially with τ\tau.

  • 2.

    The calculation of each summation in (12), which requires a numerical multivariate integration due to the nature of multivariate Gaussian function, can be even difficult especially in high dimensions (Genz and Bretz,, 2009)

However, here we avoid such difficulty by using the following copula representation which is inspired by factor copula representation in Krupskii and Joe, (2013), Nikoloulopoulos and Joe, (2015) and Kadhem and Nikoloulopoulos, (2019). For 𝝆\boldsymbol{\rho} defined in (6) satisfying (9), we extend the modeling of 𝒁(𝝉)\boldsymbol{Z}_{(\boldsymbol{\tau})} by including the random effect RR so that the joint distribution of (R,𝒁(𝝉))\left(R,\boldsymbol{Z}_{(\boldsymbol{\tau})}\right) is given by

H(r,𝒛(𝝉))\displaystyle H^{*}\left(r,\boldsymbol{z}_{(\boldsymbol{\tau})}\right) (13)
=C(𝒏)(𝝆,𝜽)(Φ(r),F1(n1),G1,1(y1,1),,G1,n1(y1,n1),Fτ(nτ),Gτ,1(yτ,1),,Gτ,nτ(y1,nτ)).\displaystyle=C_{(\boldsymbol{n})}^{(\boldsymbol{\rho},\boldsymbol{\theta})}\left(\Phi(r),F_{1}(n_{1}),G_{1,1}(y_{1,1}),\cdots,G_{1,n_{1}}(y_{1,n_{1}}),\cdots F_{\tau}(n_{\tau}),G_{\tau,1}(y_{\tau,1}),\cdots,G_{\tau,n_{\tau}}(y_{1,n_{\tau}})\right).

Naturally, by the property of the copula C(𝒏)(𝝆,𝜽)C_{(\boldsymbol{n})}^{(\boldsymbol{\rho},\boldsymbol{\theta})}, the joint distribution in (13) implies the joint distribution function in (11) in the following sense

H(𝒛(𝝉))=limrH(r,𝒛(𝝉)),H\left(\boldsymbol{z}_{(\boldsymbol{\tau})}\right)=\lim\limits_{r\rightarrow\infty}H^{*}\left(r,\boldsymbol{z}_{(\boldsymbol{\tau})}\right),

which further implies that the random vector (R,𝒁(𝝉))\left(R,\boldsymbol{Z}_{(\boldsymbol{\tau})}\right) is a natural extension of the random vector 𝒁(𝝉)\boldsymbol{Z}_{(\boldsymbol{\tau})}. Furthermore, reparameterization in (6) gives us a well-defined and natural dependence structure with the shared random effect R{R} so that claims across multiple years would be independent conditional on R{R}. For example, if (6) holds and θ1=θ2=0\theta_{1}=\theta_{2}=0, then one can see that 𝒁(𝝉)\boldsymbol{Z}_{(\boldsymbol{\tau})} are not only conditionally independent but also marginally independent so that 𝒁t𝒁t\boldsymbol{Z}_{t}\perp\boldsymbol{Z}_{t^{\prime}} for all ttt\neq t^{\prime}. In addition to (6), if θ3=θ4=0\theta_{3}=\theta_{4}=0, then 𝑴\boldsymbol{M} is not only block-diagonal, but diagonal, which implies that frequency and severity are independent once the shared random effect RR is controlled. In other words, dependence between frequency and severity are fully explained by the shared random effect R{R}. Finally, if (6) holds and θ1=θ2=θ3=θ4=0\theta_{1}=\theta_{2}=\theta_{3}=\theta_{4}=0, then 𝚺(𝒏)(𝝆)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})} is diagonal, which implies our model specification includes the traditional model, which assumes independence among the claims in different years and independence between the frequency and severity.

The following theorem shows us the key idea of our copula representation where the observed claim 𝒁t\boldsymbol{Z}_{t} for t=1,,τt=1,\ldots,\tau are independent conditional on the random effect RR, and can be fitted into the special case of Model 1. In this regard, the copula in (13) has similar spirit as a factor copula. The corresponding copula of the distribution of 𝒁t\boldsymbol{Z}_{t} conditional on RR is a Gaussian copula which can be represented as 1-factor copula. As a result, the distribution in (13) have 2-factor copula representation. However, such representation of the model increases the complexity of the notation while provides limited benefit in computational complexity, and hence we do not pursue such representation for the simplicity of the paper.

Theorem 2.

Suppose that 𝛒\boldsymbol{\rho} satisfies (6) and joint distribution function HH^{*} of the random vector (R,𝐙t)\left(R,\boldsymbol{Z}_{t}\right) is given by the factor copula model in (13). Then, we have the following results.

  1. i.

    The distribution function of 𝒁(𝝉)\boldsymbol{Z}_{(\boldsymbol{\tau})} can be obtained as in (11).

  2. ii.

    The density function of 𝒁(𝝉)=𝒛(𝝉)\boldsymbol{Z}_{(\boldsymbol{\tau})}=\boldsymbol{z}_{(\boldsymbol{\tau})} conditional on R=rR=r is given by

    h(𝒛(𝝉)|r)=t=1τht(𝒛t|r)h^{*}\left(\boldsymbol{z}_{(\boldsymbol{\tau})}|r\right)=\prod\limits_{t=1}^{\tau}h_{t}^{*}\left(\boldsymbol{z}_{t}|r\right)

    where ht(|r)h_{t}^{*}(\cdot|r) is the conditional density function of 𝒁t\boldsymbol{Z}_{t} conditional on R=rR=r.

  3. iii.

    NtRN_{t}\perp R\, for t=1,t=1,\ldots if and only if θ1=0\theta_{1}=0.

  4. iv.

    𝒚tR\boldsymbol{y}_{t}\perp R\, for t=1,t=1,\ldots if and only if θ2=0\theta_{2}=0.

Proof.

The proof of part i is trivial from the property of copula function. The proof of part ii, by the invariance property of the copula under the monotone transformation, we have that the corresponding copula CC of the conditional distribution of random vector 𝒁(𝝉)\boldsymbol{Z}_{(\boldsymbol{\tau})} conditional on R=rR=r is again a Gaussian copula. Furthermore, knowing that CC is a Gaussian copula, Theorem 1 shows that 𝒁1,,𝒁τ\boldsymbol{Z}_{1},\cdots,\boldsymbol{Z}_{\tau} are independent conditional on R=rR=r. The proofs of part iii and iv are immediate from the property of Gaussian copulas. ∎

Based on this result in Theorem 2, one can obtain the joint density of (𝒁1,,𝒁τ)(\boldsymbol{Z}_{1},\cdots,\boldsymbol{Z}_{\tau}) just with a single dimensional (numerical) integration as the following manner.

Corollary 1.

Consider the random vector 𝐙t\boldsymbol{Z}_{t} under the settings in Model 2. Then, the joint density of 𝐙(𝛕)\boldsymbol{Z}_{(\boldsymbol{\tau})} is given as follows:

h(𝒛(𝝉))\displaystyle h(\boldsymbol{z}_{(\boldsymbol{\tau})})
=t=1τ[gt[joint](yt,1,,yt,nt|r)(Φ(Φ1(F(nt))μtσt)Φ(Φ1(F(nt1))μtσt))]ϕ(r)dr\displaystyle=\int\prod\limits_{t=1}^{\tau}\left[g_{t}^{\rm[joint]*}\left(y_{t,1},\cdots,y_{t,n_{t}}|r\right)\left(\Phi\left(\frac{\Phi^{-1}(F(n_{t}))-\mu_{t}}{\sigma_{t}}\right)-\Phi\left(\frac{\Phi^{-1}(F(n_{t}-1))-\mu_{t}}{\sigma_{t}}\right)\right)\right]\phi(r){\rm d}{r}

where

μt=(θ1,ρ1,,ρ1)(𝚺(tt)(𝝆))1(r,Φ1(G(yt,1)),,Φ1(G(yt,nt)))T\mu_{t}=\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)\left(\boldsymbol{\Sigma}_{(tt)}^{(\boldsymbol{\rho}^{*})}\right)^{-1}\left(r,\Phi^{-1}\left(G(y_{t,1})\right),\cdots,\Phi^{-1}\left(G(y_{t,n_{t}})\right)\right)^{\mathrm{T}} (14)

and

(σt)2=1(θ1,ρ1,,ρ1)(𝚺(tt)(𝝆))1(θ1,ρ1,,ρ1)T\left(\sigma_{t}\right)^{2}=1-\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)\left(\boldsymbol{\Sigma}_{(tt)}^{(\boldsymbol{\rho}^{*})}\right)^{-1}\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)^{\mathrm{T}} (15)

with ρ1=θ2\rho_{1}^{*}=\theta_{2} and ρ2=ρ2\rho_{2}^{*}=\rho_{2}. Here, gt[joint](|r)g_{t}^{\rm[joint]*}\left(\cdot|r\right) is the density function of 𝐘t\boldsymbol{Y}_{t} conditional on R=rR=r, and given by

gt[joint](yt,1,,yt,nt|r)\displaystyle g_{t}^{\rm[joint]*}\left(y_{t,1},\cdots,y_{t,n_{t}}|r\right)
=ϕ𝝁,𝚺(Φ1(G(yt,1)),,Φ1(G(yt,nt)))j=1ntg(yt,j)ϕ(Φ1(G(yt,j)))\displaystyle=\phi_{\boldsymbol{\mu^{*}},\boldsymbol{\Sigma}^{*}}\left(\Phi^{-1}\left(G\left(y_{t,1}\right)\right),\cdots,\Phi^{-1}\left(G\left(y_{t,n_{t}}\right)\right)\right)\prod\limits_{j=1}^{n_{t}}\frac{g(y_{t,j})}{\phi\left(\Phi^{-1}\left(G(y_{t,j})\right)\right)}

where

𝝁=rθ2𝟏ntand𝚺=(1ρ2)𝑰nt+(ρ2θ22)𝑱nt×nt.\boldsymbol{\mu^{*}}=r\,\theta_{2}\boldsymbol{1}_{n_{t}}\quad\hbox{and}\quad\boldsymbol{\Sigma}^{*}=(1-\rho_{2})\boldsymbol{I}_{n_{t}}+(\rho_{2}-\theta_{2}^{2})\boldsymbol{J}_{n_{t}\times n_{t}}.
Proof of Corollary 1.

According to Theorem 2, we extend 𝒁(𝝉)\boldsymbol{Z}_{(\boldsymbol{\tau})} to the factor copula model (R,𝒁(𝝉))\left(R,\boldsymbol{Z}_{(\boldsymbol{\tau})}\right) having the distribution function in (13). Then, we have

h(𝒛(𝝉))\displaystyle h(\boldsymbol{z}_{(\boldsymbol{\tau})}) =h(𝒛(𝝉)|r)ϕ(r)dr\displaystyle=\int h^{*}(\boldsymbol{z}_{\boldsymbol{(\tau)}}|r)\phi(r){\rm d}r
=t=1τht(𝒛t|r)ϕ(r)dr\displaystyle=\int\prod\limits_{t=1}^{\tau}h_{t}^{*}(\boldsymbol{z}_{t}|r)\phi(r){\rm d}r
=t=1τ[gt[joint](yt,1,,yt,nt|r)(Nt=nt|yt,1,,yt,nt,r)]ϕ(r)dr\displaystyle=\int\prod\limits_{t=1}^{\tau}\left[g_{t}^{\rm[joint]*}\left(y_{t,1},\cdots,y_{t,n_{t}}|r\right){\mathbb{P}}\left(N_{t}=n_{t}|y_{t,1},\cdots,y_{t,n_{t}},r\right)\right]\phi(r){\rm d}r

where h(|r)h^{*}(\cdot|r), and ht(|r)h_{t}^{*}(\cdot|r) are the density functions of 𝒁(𝝉)\boldsymbol{Z}_{(\boldsymbol{\tau})} and 𝒁t\boldsymbol{Z}_{t}, respectively, and gt[joint](yt,1,,yt,nt|r)g_{t}^{\rm[joint]*}\left(y_{t,1},\cdots,y_{t,n_{t}}|r\right) is the density function of (Yt,1,,Yt,nt)\left(Y_{t,1},\cdots,Y_{t,n_{t}}\right) at (Yt,1,,Yt,nt)=(yt,1,,Yy,nt)\left(Y_{t,1},\cdots,Y_{t,n_{t}}\right)=\left(y_{t,1},\cdots,Y_{y,n_{t}}\right) conditional on R=rR=r. Here, the second equality is from Theorem 2, and the final equality is just conditional distribution expression of the joint density function.

Finally, it suffices to show that

(nt|yt,1,,yt,nt,r)\displaystyle{\mathbb{P}}\left(n_{t}|y_{t,1},\cdots,y_{t,n_{t}},r\right) =(Ntnt|r,yt,1,,yt,nt)(Ntnt1|r,yt,1,,yt,nt)\displaystyle=\mathbb{P}(N_{t}\leq n_{t}|r,y_{t,1},\cdots,y_{t,n_{t}})-\mathbb{P}(N_{t}\leq n_{t}-1|r,y_{t,1},\cdots,y_{t,n_{t}})
=Φ(Φ1(F(nt))μtσt)Φ(Φ1(F(nt1))μtσt)\displaystyle=\Phi\left(\frac{\Phi^{-1}(F(n_{t}))-\mu_{t}}{\sigma_{t}}\right)-\Phi\left(\frac{\Phi^{-1}(F(n_{t}-1))-\mu_{t}}{\sigma_{t}}\right)

and

gt[joint](yt,1,,yt,nt|r)=ϕ𝝁,𝚺(Φ1(G(yt,1)),,Φ1(G(yt,nt)))j=1ntg(yt,j)ϕ(Φ1(G(yt,j))),g_{t}^{\rm[joint]*}\left(y_{t,1},\cdots,y_{t,n_{t}}|r\right)=\phi_{\boldsymbol{\mu^{*}},\boldsymbol{\Sigma}^{*}}\left(\Phi^{-1}\left(G\left(y_{t,1}\right)\right),\cdots,\Phi^{-1}\left(G\left(y_{t,n_{t}}\right)\right)\right)\prod\limits_{j=1}^{n_{t}}\frac{g(y_{t,j})}{\phi\left(\Phi^{-1}\left(G(y_{t,j})\right)\right)},

which are proved by Lemmas 1 and 2, respectively in A. ∎

Remark 1.

The model specification in Model 2, and equivalently Model 1, is not casual in the sense that the length or dimension of the observation varies depending on the value of the observation. For example, we have 𝐳t=(nt,yt,1)\boldsymbol{z}_{t}=(n_{t},y_{t,1}) for nt=1n_{t}=1 while 𝐳t=(nt,yt,1,yt,2)\boldsymbol{z}_{t}=(n_{t},y_{t,1},y_{t,2}) for nt=2n_{t}=2. Hence, the model itself does not seem to be well-defined as it is not even clear how to mathematically define cumulative distribution function or the joint density function. In C, we show how to interpret the density function and corresponding distribution function in Model 2 so that they are well-defined. Specifically, one can easily check that the copula function C(𝐧)(𝛒)C_{(\boldsymbol{n})}^{(\boldsymbol{\rho})} in Model 2 satisfies the inheritance property in the similar manner as in (C.2), which further implies that Model 2 can be reformulated as Model 4 where the distribution and density functions are well-defined. Finally, one can easily show that the distribution and density functions in Model 2 are the same as those in Model 4 so that they are well-defined. We leave the detailed discussion in C. The discussion on the well-definedness of Model 2 but limited to one-year model can be also found in Oh et al., 2020a .

4 Simulation study

In this section, we conduct a simulation study to investigate the finite sample properties of the parameter estimates and effects of the dependences on them for the proposed method based on Model 2. We assume one risk class only, where the distribution function FF follows Poisson distribution with mean parameter λ0\lambda_{0} and the distribution function GG follows Weibull distribution with mean paramter ξ0\xi_{0} and shape paramter ν\nu. Here, the parameters for the marginal part of severity are specified as ξ0=exp(8)\xi_{0}=\exp(8), and ν=0.7\nu=0.7, which are the same for all scenarios. The portfolio of policyholders of size II observed for three years (τ=3\tau=3) are generated from the proposed model under 8 scenarios motivated by the real data analysis in Section 5. Table 1 provides the rest of parameter settings and the corresponding correlation coefficients.

Table 1: Parameter settings of the copula part for each scenario
Parameter
Scenario I λ0\lambda_{0} θ1\theta_{1} θ2\theta_{2} θ3\theta_{3} θ4\theta_{4} ρ1\rho_{1} ρ2\rho_{2} ρ3\rho_{3} ρ4\rho_{4} ρ5\rho_{5}
1 500 2 0.3 0.3 0.5 0.5 0.34 0.34 0.09 0.09 0.09
2 0.3 0.3 0 0 0.09 0.09 0.09 0.09 0.09
3 0.3 0.7 0.5 0.5 0.46 0.74 0.09 0.21 0.49
4 0.3 0.7 0 0 0.21 0.49 0.09 0.21 0.49
5 0.7 0.3 0.5 0.5 0.46 0.34 0.49 0.21 0.09
6 0.7 0.3 0 0 0.21 0.09 0.49 0.21 0.09
7 0.7 0.7 0.5 0.5 0.74 0.74 0.49 0.49 0.49
8 0.7 0.7 0 0 0.49 0.49 0.49 0.49 0.49

For each scenario, Tables 2 and 3 summarize the simulation results from 500 independent Monte Carlo samples, including the relative bias and mean squared error (MSE) of the parameter estimates. Table 2 indicates that in all the scenarios, the estimates are close to the true parameters of the proposed model and shows that the relative bias and MSE are small.

Table 2: Relative bias in %\% for all the parameters from the each scenarios
RB (%\%)
Scenario λ0\lambda_{0} ξ0\xi_{0} ν\nu θ1\theta_{1} θ2\theta_{2} θ3\theta_{3} θ4\theta_{4}
1 -0.21 -0.06 0.17 -1.48 1.08 -0.06 -0.27
2 -0.26 -0.26 1.19 0.84 -0.98 - -
3 -0.52 -0.03 -10.77 -9.81 1.16 -1.19 0.10
4 0.05 -0.53 1.74 6.09 0.64 - -
5 0.00 -0.08 -0.29 0.33 1.43 -1.36 0.14
6 0.38 -0.31 18.92 -1.70 -0.63 - -
7 -0.16 -0.03 -20.82 -19.90 0.54 -0.63 -0.01
8 0.14 -0.50 13.95 9.68 1.00 - -
Table 3: Mean absolute error for all the parameters from the 12 scenarios
MSE
Scenario λ0\lambda_{0} ξ0\xi_{0} ν\nu θ1\theta_{1} θ2\theta_{2} θ3\theta_{3} θ4\theta_{4}
1 0.0015 0.0008 0.0023 0.0012 0.0021 0.0007 0.0001
2 0.0010 0.0009 0.0011 0.0004 0.0001 - -
3 0.0014 0.0009 0.0131 0.0039 0.0023 0.0008 0.0001
4 0.0015 0.0020 0.0023 0.0012 0.0001 - -
5 0.0014 0.0011 0.0038 0.0015 0.0033 0.0008 0.0001
6 0.0015 0.0009 0.0023 0.0006 0.0001 - -
7 0.0012 0.0011 0.0124 0.0064 0.0025 0.0007 0.0002
8 0.0017 0.0019 0.0069 0.0024 0.0001 - -

5 Empirical application

In this section, we now calibrate the proposed model to a real auto insurance dataset, to examine dependence structure (i) between frequency and severity within a year, (ii) among two distinct severities within a year, (iii) among frequencies across years, (iv) between frequency and severity in different years, and (v) between two severities in different years.

5.1 Data

For this empirical investigation, we employ a dataset from a general insurer in Singapore, which consists of a portfolio of personal automobile insurance policies with comprehensive coverages. The dataset has been obtained from General Insurance Association of Singapore, a trade association with representations from all the general insurance companies in Singapore. The claims experience observed from this dataset is longitudinal over a period of six years, from 1995 to 2000, and has 17,452 unique policyholders. Among the observations, we randomly sample 5000 policyholders. To calibrate the models, the observations for the first five years, 1995-1999 is used as in-sample, or training data, and in order to examine the performance of the model, we use the last year 2000 as the hold-out sample, or test data.

The dataset also contains a set of predictors that could further explain additional variation in the number of claims and the claim amounts. To summarize the variables observed, we have three categorical variables and one continuous variable: the gender with two levels (male and female), insured’s age (Age) with four levelss including age 1 (0,25]\in(0,25], age 2 (25,35]\in(25,35], age 3 (35,65]\in(35,65], and age 4 (65,]\in(65,\infty], vehicle age (VehAge) with four levels including vehicle age 1 [0,1]\in[0,1], vehicle age 2 (1,5]\in(1,5], vehicle age 3 (5,10]\in(5,10], and vehicle age 4 (10,]\in(10,\infty], and vehicle’s capacity expressed in log scale (logVehCapa).

Table 4 summarizes the description and simple statistics of these predictor variables which represent the risk characteristics of policyholders: Gender, Age, VehAge, and logVehCapa. In Singapore, as observed in this table, there is a disproportionate distribution by gender with more male than female drivers. When we the distribution of drivers by age, it is also not surprising to find fewer percentage of younger drivers, unlike that in other developed countries. The primary reason for this is the extremely expensive cost of owning and maintaining a car, in addition to the efficiency of the use of public transportation. During the period of observation, it is highly discourage to own a car for more than 10 years, and this reflected in this distribution.

Furthermore, a summary of the claim frequency over the years 1995 to 1999 is given in Table 5 and the average claim amount categorized by frequency and year is given in Table 6. This table suggests that the claims size appears to be unstable over time. We adjust the values of the individual severities, in order to satisfy that the average of individual severity over each year is the same as the average over the 2000 hold-out sample data with 4,659 observations.

Table 4: Observable policy characteristics used as covariates
Categorical Description Proportions
variables
Gender Insured’s sex:
      Male = 1 80.03%\%
      Female = 0 19.97%\%
Age The policyholder’s issue age :
      Age (0,   25]=1\in(0,\,\,\,25]=1 0.49%\%
      Age (25,35]=2\in(25,35]=2 21.68%\%
      Age (35,65]=3\in(35,65]=3 76.81%\%
      Age (65,]=4\in(65,\infty]=4 1.03%\%
VehAge Age of vehicle in years :
      VehAge [0,1]=1\in[0,\,\quad 1]=1 12.45%\%
      VehAge (1,5]=2\in(1,\quad 5]=2 57.30%\%
      VehAge (5,   10]=3\in(5,\,\,\,10]=3 29.99%\%
      VehAge (10,]=4\in(10,\infty]=4 0.25%\%
Continuous Min Mean Max
variables
logVehCapa Insured vehicle’s capacity in cc 6.49 7.19 8.82
Table 5: Number of observations by frequency and year
Train Test
Frequency 1995 1996 1997 1998 1999 Count %\% of Total 2000 %\% of Total
0 3103 3291 2501 2036 1751 12682 91.05 1360 92.39
1 232 212 266 214 219 1143 8.21 104 7.07
2 17 8 20 24 18 87 0.62 8 0.54
3 2 1 2 2 4 11 0.08 0 0
Count 3354 3512 2789 2276 1992 13923 100 4659 100
Table 6: Average severity by frequency and year
Train Test
Frequency 1995 1996 1997 1998 1999 Avg. severity 2000
1 4742 4530 4567 5440 3895 4630 4557
2 6319 3633 3629 3781 3644 4200 2950
3 2630 1687 4991 6015 3065 3747 -
4 - - - - - - -
Avg. severity 4892 4431 4455 5156 3824 4553 4046

5.2 Estimation result

For the data analysis, we consider the model with regression setting described in Corollary 1. We assume the distribution function, FF, follows a Poisson distribution with mean parameter, λ\lambda, for the frequency, and the distribution function, GG, follows a Weibull distribution with mean parameter, ξ\xi, and shape parameter, ν\nu, for the severity component. With a log link function, we therefore have

λ=exp(𝒙𝜷),andξ=exp(𝒘𝜸),\lambda=\exp(\boldsymbol{x}\boldsymbol{\beta}),\quad\hbox{and}\quad\xi=\exp(\boldsymbol{w}\boldsymbol{\gamma}),

where 𝒙\boldsymbol{x} and 𝒘\boldsymbol{w} are the vectors of model matrices for each policyholder 111In this example, 𝒙\boldsymbol{x} includes Gender, Age, and VehAge, and 𝒘\boldsymbol{w} includes VehCapa, Age, and VehAge., and 𝜷\boldsymbol{\beta} and 𝜸\boldsymbol{\gamma} are the corresponding parameters for the frequency and severity, respectively. Hence, in this data analysis, we consider following parameters: (𝜷,𝜸,ν,θ1,θ2,θ3,θ4)(\boldsymbol{\beta},\boldsymbol{\gamma},\nu,\theta_{1},\theta_{2},\theta_{3},\theta_{4}).

Table 7 presents the summary statistics for the model estimation results. This table provides details of the estimated parameters for the frequency part, the severity part, as well as the copula part. There are four measures detailed in this table: estimates (est), standard errors (std.error), t statistics (t), and corresponding p-values. Note that the asterisk sign (*) indicates that the estimate is significant at 0.05 level. From the table, the results are as expected. For example, despite the disproportionate percentage, male drivers are expected to incur more accidents than female drivers. When analyzed by age, broadly speaking, both frequency and severity tend to decrease with age. Elderly drivers, for example, have fewer number of accidents with smaller average costs per accident than drivers less than 25 years old.

Table 7: Estimation result
parameter est std.error t p-value
Frequency part
(Intercept) -2.237 0.289 -7.749 <<.0001 *
Gender 0.125 0.067 1.865 0.0623
VehAge2 0.048 0.101 0.477 0.6336
VehAge3 -0.146 0.109 -1.339 0.1806
VehAge4 0.835 0.443 1.886 0.0594
Age2 0.323 0.270 1.197 0.2313
Age3 0.156 0.268 0.583 0.5601
Age4 -0.569 0.460 -1.237 0.2161
Severity part
(Intercept) 3.889 0.938 4.148 <<.0001 *
logVehCapa 0.700 0.108 6.468 <<.0001 *
VehAge2 -0.010 0.097 -0.099 0.9212
VehAge3 -0.060 0.110 -0.547 0.5843
VehAge4 -0.624 0.565 -1.106 0.2690
Age2 -1.092 0.478 -2.284 0.0224 *
Age3 -0.976 0.475 -2.055 0.0400 *
Age4 -0.969 0.668 -1.451 0.1468
ν\nu 0.802 0.045 17.910 <<.0001 *
Copula part
θ1\theta_{1} 0.263 0.048 5.509 <<.0001 *
θ2\theta_{2} 0.057 0.071 0.795 0.4266
θ3\theta_{3} 0.409 0.138 2.967 0.0030 *
θ4\theta_{4} 0.445 0.133 3.341 0.0008 *

In terms of understanding the presence of dependence, Table 7 also summarizes estimates of the four copula parameters of dependence as described by θi\theta_{i}, for i=1,2,3,4i=1,2,3,4, and the estimates are not all significantly nonzero at the 5%\% level. For the interpretation of copula parameters in Table 7, one can recall the following meaning of θ1,θ2,θ3\theta_{1},\theta_{2},\theta_{3}, and θ4\theta_{4}:

  • 1.

    θ1\theta_{1}: dependence parameter between the common random effect R{R} and the frequency for every year,

  • 2.

    θ2\theta_{2}: dependence parameter between the common random effect R{R} and each severity for every year,

  • 3.

    θ3\theta_{3}, θ4\theta_{4}: dependence parameters between a frequency and each severity within a year not explained by R{R}.

Thus, according to the estimation results which shows that only θ1\theta_{1} and θ2\theta_{2} are significantly different from zero, we can claim presence of both types of dependence; temporal dependence of claim frequencies and severities as well as dependence between the frequency and severity can be explained by common random effect R{R}. On the other hand, there is weak evidence of dependence between a frequency and each frequency within a year not explained by R{R}.

While the values of θ\theta tells us the relationship between the common random effects and claims, one can directly quantify the magnitude of dependence among the claims by observing the estimated values of ρ\rho’s. According to the model specification in (6), the estimates of dependence parameters, ρ1,ρ2,ρ3,ρ4,\rho_{1},\,\rho_{2},\,\rho_{3},\,\rho_{4}, and ρ5\rho_{5} are calculated from the estimates of θ1,θ2,θ3\theta_{1},\,\theta_{2},\,\theta_{3}, and θ4\theta_{4} as in (6), and their standard errors are obtained using delta method. Table 8 summarizes the derived estimates together with the resulting standard errors of ρ1,ρ2,ρ3,ρ4,\rho_{1},\,\rho_{2},\,\rho_{3},\,\rho_{4}, and ρ5\rho_{5}.

Table 8: Derived estimates and standard errors of ρ\rho’s
parameter est std.error t p-value
ρ1\rho_{1} 0.1968 0.074 2.655 0.0079 *
ρ2\rho_{2} 0.2015 0.119 1.688 0.0915
ρ3\rho_{3} 0.0690 0.025 2.754 0.0059 *
ρ4\rho_{4} 0.0149 0.019 0.780 0.4355
ρ5\rho_{5} 0.0032 0.008 0.398 0.6909

It is interesting to observe that there is now a clearer evidence of all types of dependencies in our multi-year microlevel model. For example, ρ1\rho_{1} describes correlation between a frequency and a severity within a single year, and results provide strong evidence of a positive dependence. The estimate for ρ1\rho_{1} is 0.19680.1968 with a standard error of 0.0740.074, which leads to a very small p-value indicating significantly different from zero. Using the results from (6), despite the non-significance of θ3\theta_{3} and θ4\theta_{4} directly drawn from the estimated model, there is a clear inherent dependence driven by the shared random effect through the interplay with θ1\theta_{1} and θ2\theta_{2}. A similar argument can be said of the other ρ\rho’s.

5.3 Validation

For validation of the proposed model in terms of the individual loss prediction for the 1,472 policyholders in the hold-out sample, we compare the following four models: full model, nested model 1 with θ3=θ4=0\theta_{3}=\theta_{4}=0, the nested model 2 with θ1=θ2=0\theta_{1}=\theta_{2}=0, and the nested model 3 with θ1=θ2=θ3=θ4=0\theta_{1}=\theta_{2}=\theta_{3}=\theta_{4}=0. We measure the quality of prediction as mean squared error (MSE) of average of individual loss prediction over 5,000 Monte Carlo simulations under the estimation result from each model. We also use other measures such as root-mean-square deviation (RMSE), mean absolute error (MAE), and the Gini index in Frees et al., 2011b ; Frees et al., 2014b . For example, MSE for full model is given as

MSE^[full]=11472i=11472(SiS^i[full])2,\widehat{{\rm MSE}}^{\rm[full]}=\frac{1}{1472}\sum_{i=1}^{1472}{\left(S_{i}-\hat{S}_{i}^{\rm[full]}\right)^{2}},

where SiS_{i} is the observed aggregate loss for the ii-th policyholder in 2000, and S^i[full]\hat{S}_{i}^{\rm[full]} is the average of predicted aggregate loss for the ii-th policyholder over 5,000 MC samples from based on the full model. The results are shown in Table 9. In the table, full model shows the best performance in terms of MSE and RMSE, and nested model 1 shows best performance in terms of MAE. On the other hand, nested model 2 shows the best performance in terms of the Gini index.

Table 9: Means squared error
Full Nested 1 Nested 2 Nested 3
RMSE 2445.409 2448.719 2445.519 2448.276
MSE 5980026 5996227 5980564 5994053
MAE 596.87 524.2761 605.0203 530.9766
Gini 28.535 30.478 27.560 29.547

6 Final remarks

This article focuses on the development of a multi-year microlevel collective risk model which accounts for a flexible dependence structure for claim frequencies and claim severities. The common theme in the literature is a framework that regards dependence between claim frequency and the average severity. Our motivating example demonstrates that for these types of dependence models, the copula structure can be constrained. Here, we also show that it is even difficult to arrive at the naive assumption of independence among severities.

In our multi-year microlevel collective risk model, we develop a shared random effects framework that captures various relevant types of dependence between claim frequencies and claim severities over multiple years. The shared random effect parameter induces several forms of dependence; it has similar structure to a one-factor copula model previously studied. Our proposed scheme has the advantages of not only ease of computation but also the capacity to draw intuitive interpretation to the results. Furthermore, it covers other types of dependent frequency and severity models that have previously been studied. One can see that both one-year dependent compound risk model and traditional independent compound risk model are special cases of our proposed model, where θ1=θ2=0\theta_{1}=\theta_{2}=0 and θ1=θ2=θ3=θ4=0\theta_{1}=\theta_{2}=\theta_{3}=\theta_{4}=0, respectively. In the paper, we additionally provide an efficient way to obtain the joint density of multi-year claim required without heavy numerical integration.

We calibrated our proposed with a dataset from a Singapore automobile insurance company, which contains policy characteristics and microlevel claims information for multiple years. The estimation results show us that all five types of correlations considered in a multi-year microlevel collective risk model are statistically significant. We note that the driving force for the dependencies originates from the shared random effect parameter. On top of that, out-of-sample validation results with the proposed model show us that it can be helpful to consider various types of dependence to increase the prediction performances.

Acknowledgements

Rosy Oh was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2019R1A6A1A11051177 and 2020R1I1A1A01067376). Himchan Jeong was supported by James C. Hickman Doctoral Stipend funded by the Society of Actuaries (SOA). Emiliano A. Valdez was supported by CAE Research Grant on Applying Data Mining Techniques in Actuarial Science funded by the Society of Actuaries (SOA). Jae Youn Ahn was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean Government (2020R1F1A1A01061202).

Appendix A Proof of Lemmas 1 and 2

Lemma 1.

If (R,𝐙(𝛕))\left(R,\boldsymbol{Z}_{(\boldsymbol{\tau})}\right) follows (13), then the probability of Φ1(Ft(Nt))\Phi^{-1}(F_{t}(N_{t})) conditional on

(R,Yt,1,,Yt,nt)=(r,yt,1,,yt,nt)\left(R,Y_{t,1},\ldots,Y_{t,n_{t}}\right)=\left(r,y_{t,1},\ldots,y_{t,n_{t}}\right)

is given by

(Ntn|r,yt,1,,yt,nt)=Φ(Φ1(Ft(n))μtσt)\mathbb{P}(N_{t}\leq n|r,y_{t,1},\cdots,y_{t,n_{t}})=\Phi\left(\frac{\Phi^{-1}(F_{t}(n))-\mu_{t}}{\sigma_{t}}\right)

where

μt=(θ1,ρ1,,ρ1)(𝚺(tt)(𝝆))1(r,Φ1(G(yt,1)),,Φ1(G(yt,nt)))T\mu_{t}=\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)\left(\boldsymbol{\Sigma}_{(tt)}^{(\boldsymbol{\rho}^{*})}\right)^{-1}\left(r,\Phi^{-1}\left(G(y_{t,1})\right),\cdots,\Phi^{-1}\left(G(y_{t,n_{t}})\right)\right)^{\mathrm{T}}

and

(σt)2=1(θ1,ρ1,,ρ1)(𝚺(tt)(𝝆))1(θ1,ρ1,,ρ1)T\left(\sigma_{t}\right)^{2}=1-\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)\left(\boldsymbol{\Sigma}_{(tt)}^{(\boldsymbol{\rho}^{*})}\right)^{-1}\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)^{\mathrm{T}}

with ρ1=θ2,ρ2=ρ2\rho_{1}^{*}=\theta_{2},\,\rho_{2}^{*}=\rho_{2}.

Proof.

By the inherited property of the Gaussian copula, it is easy to see that

(Φ1(Ft(Nt))x0,Rr,Φ1(Gt,1(Yt,1))x1,,Φ1(Gt,nt(Yt,nt))xnt)\displaystyle{\mathbb{P}}\left(\Phi^{-1}(F_{t}(N_{t}))\leq x_{0},R\leq r,\Phi^{-1}(G_{t,1}(Y_{t,1}))\leq x_{1},\cdots,\Phi^{-1}(G_{t,n_{t}}(Y_{t,n_{t}}))\leq x_{n_{t}}\right) (A.1)
=Φ𝟎nt+2,𝚺t(x0,r,x1,,xnt)\displaystyle=\Phi_{{\bf 0}_{n_{t}+2},\boldsymbol{\Sigma}_{t}}\left(x_{0},r,x_{1},\cdots,x_{n_{t}}\right)

where

[Σt]m={1,=m;θ1,+m=3;ρ1,1=,m>2 or >2,m=1;θ2,2=<m or >m=2;ρ2,elsewhere;=(1θ1ρ1ρ1\hdashlineθ11θ2θ2\hdashlineρ1θ21ρ2ρ2ρ21ρ2ρ1θ2ρ2ρ21)[\Sigma_{t}]_{\ell m}=\begin{cases}1,&\ell=m;\\ \theta_{1},&\ell+m=3;\\ \rho_{1},&1=\ell,m>2\text{ or }\ell>2,m=1;\\ \theta_{2},&2=\ell<m\text{ or }\ell>m=2;\\ \rho_{2},&\hbox{elsewhere};\end{cases}=\left(\begin{array}[]{c:c:cccc}1&\theta_{1}&\rho_{1}&\cdots&\cdots&\rho_{1}\\ \hdashline\theta_{1}&1&\theta_{2}&\cdots&\cdots&\theta_{2}\\ \hdashline\rho_{1}&\theta_{2}&1&\rho_{2}&\cdots&\rho_{2}\\ \vdots&\vdots&\rho_{2}&\ddots&\ddots&\vdots\\ \vdots&\vdots&\vdots&\ddots&1&\rho_{2}\\ \rho_{1}&\theta_{2}&\rho_{2}&\cdots&\rho_{2}&1\\ \end{array}\right)

Furthermore, (A.1) implies

(Φ1(Ft(Nt))x0|R=r,Yt,1=yt,1,,Yt,nt=yt,nt)=Φ(x0μtσt){\mathbb{P}}\left(\Phi^{-1}(F_{t}(N_{t}))\leq x_{0}|R=r,Y_{t,1}=y_{t,1},\ldots,Y_{t,n_{t}}=y_{t,n_{t}}\right)=\Phi\left(\frac{x_{0}-\mu_{t}}{\sigma_{t}}\right)

where

μt=(θ1,ρ1,,ρ1)(𝚺(tt)(𝝆))1(r,Φ1(G(yt,1)),,Φ1(G(yt,nt)))T\mu_{t}=\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)\left(\boldsymbol{\Sigma}_{(tt)}^{(\boldsymbol{\rho}^{*})}\right)^{-1}\left(r,\Phi^{-1}\left(G(y_{t,1})\right),\cdots,\Phi^{-1}\left(G(y_{t,n_{t}})\right)\right)^{\mathrm{T}}

and

(σt)2=1(θ1,ρ1,,ρ1)(𝚺(tt)(𝝆))1(θ1,ρ1,,ρ1)T\left(\sigma_{t}\right)^{2}=1-\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)\left(\boldsymbol{\Sigma}_{(tt)}^{(\boldsymbol{\rho}^{*})}\right)^{-1}\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)^{\mathrm{T}}

with ρ1=θ2,ρ2=ρ2\rho_{1}^{*}=\theta_{2},\,\rho_{2}^{*}=\rho_{2}.

Lemma 2.

Consider the settings in (13). Then, the density function of (R,𝐙t)\left(R,\boldsymbol{Z}_{t}\right) conditional on R=rR=r is given by

gt[joint](yt,1,,yt,nt|r)=ϕ𝝁,𝚺(Φ1(G(yt,1)),,Φ1(G(yt,nt)))j=1ntg(yt,j)ϕ(Φ1(G(yt,j))),g^{\rm[joint]*}_{t}\left(y_{t,1},\cdots,y_{t,n_{t}}|r\right)=\phi_{\boldsymbol{\mu^{*}},\boldsymbol{\Sigma}^{*}}\left(\Phi^{-1}\left(G\left(y_{t,1}\right)\right),\cdots,\Phi^{-1}\left(G\left(y_{t,n_{t}}\right)\right)\right)\prod\limits_{j=1}^{n_{t}}\frac{g(y_{t,j})}{\phi\left(\Phi^{-1}\left(G(y_{t,j})\right)\right)},

where

μ=rθ2𝟏ntand𝚺=(1ρ2)𝑰nt+(ρ2θ22)𝑱nt×nt.\mu^{*}=r\,\theta_{2}\boldsymbol{1}_{n_{t}}\quad\hbox{and}\quad\boldsymbol{\Sigma}^{*}=(1-\rho_{2})\boldsymbol{I}_{n_{t}}+(\rho_{2}-\theta_{2}^{2})\boldsymbol{J}_{n_{t}\times n_{t}}.
Proof.

By the inherited property of the Gaussian copula, it is easy to see that

(R,Yt,1,,Yt,nt)C(tt)(𝝆)(Φ,Gt,1,,Gt,nt)\left(R,Y_{t,1},\ldots,Y_{t,n_{t}}\right)\sim C_{(tt)}^{(\boldsymbol{\rho}^{*})}\left(\Phi,G_{t,1},\cdots,G_{t,n_{t}}\right)

with ρ1=θ2,ρ2=ρ2\rho_{1}^{*}=\theta_{2},\,\rho_{2}^{*}=\rho_{2}, where C(tt)(𝝆)C_{(tt)}^{(\boldsymbol{\rho}^{*})} is a Gaussian copula with correlation matrix 𝚺(tt)(𝝆)\boldsymbol{\Sigma}_{(tt)}^{(\boldsymbol{\rho}^{*})}. As a result, we have that, conditional on R=rR=r, the random vector Φ1(Gt,1(yt,1)),,Φ1(Gt,nt(yt,nt))\Phi^{-1}(G_{t,1}(y_{t,1})),\cdots,\Phi^{-1}(G_{t,n_{t}}(y_{t,n_{t}})) follows a multivariate normal distribution, with mean vector given as

(θ2,,θ2)TI11r=rθ2𝟏nt,(\theta_{2},\ldots,\theta_{2})^{\mathrm{T}}\cdot{I_{1}}^{-1}\cdot r=r\theta_{2}\boldsymbol{1}_{n_{t}},

and covariance matrix given as

(1ρ2)𝑰nt+ρ2𝑱nt×nt(θ2,,θ2)(θ2,,θ2)T=(1ρ2)𝑰nt+(ρ2θ22)𝑱nt×nt.(1-\rho_{2})\boldsymbol{I}_{n_{t}}+\rho_{2}\boldsymbol{J}_{n_{t}\times n_{t}}-(\theta_{2},\ldots,\theta_{2})\cdot(\theta_{2},\ldots,\theta_{2})^{\mathrm{T}}=(1-\rho_{2})\boldsymbol{I}_{n_{t}}+(\rho_{2}-\theta_{2}^{2})\boldsymbol{J}_{n_{t}\times n_{t}}.

Therefore, we have

(Yt,1yt,1,,Yt,ntyt,nt|r)=Φμ,𝚺(Φ1(G(yt,1)),,Φ1(G(yt,nt))){\mathbb{P}}\left(Y_{t,1}\leq y_{t,1},\cdots,Y_{t,n_{t}}\leq y_{t,n_{t}}|r\right)=\Phi_{\mu^{*},\boldsymbol{\Sigma}^{*}}\left(\Phi^{-1}\left(G\left(y_{t,1}\right)\right),\cdots,\Phi^{-1}\left(G\left(y_{t,n_{t}}\right)\right)\right)

with the following corresponding density function

gt(yt,1,,yt,nt|r)=ϕμ,𝚺(Φ1(G(yt,1)),,Φ1(G(yt,nt)))j=1ntg(yt,j)ϕ(Φ1(G(yt,j))).g^{*}_{t}\left(y_{t,1},\cdots,y_{t,n_{t}}|r\right)=\phi_{\mu^{*},\boldsymbol{\Sigma}^{*}}\left(\Phi^{-1}\left(G\left(y_{t,1}\right)\right),\cdots,\Phi^{-1}\left(G\left(y_{t,n_{t}}\right)\right)\right)\prod\limits_{j=1}^{n_{t}}\frac{g(y_{t,j})}{\phi\left(\Phi^{-1}\left(G(y_{t,j})\right)\right)}.

Appendix B Multi-year microlevel collective risk model with tt copulas

The microlevel collective risk model with the Gaussian copula in Model 2 can be naturally extended to tt copula based model as the following model shows. Here, we use FtF_{t}, GtG_{t}, Gt,jG_{t,j} considered in Section 3.2, and assume Gt,j=GtG_{t,j}=G_{t} for any t,jt,j\in\mathbb{N} for simplicity.

Model 3 (The tt copula model for the multi-year microlevel collective risk model).

Suppose 𝛒\boldsymbol{\rho} satisfies (6). Then, consider the random vector 𝐙t\boldsymbol{Z}_{t} whose joint distribution function H(𝐳τ)H(\boldsymbol{z}_{\tau}) is given by the following copula model representation

H(𝒛(τ))=C(𝒏)ν,(𝝆)(F1(n1),G1,1(y1,1),,G1,n1(y1,n1),Fτ(nτ),Gτ,1(yτ,1),,Gτ,nτ(yτ,nτ))H(\boldsymbol{z}_{(\tau)})=C_{(\boldsymbol{n})}^{{\nu},(\boldsymbol{\rho})}\left(F_{1}(n_{1}),G_{1,1}(y_{1,1}),\cdots,G_{1,n_{1}}(y_{1,n_{1}}),\cdots F_{\tau}(n_{\tau}),G_{\tau,1}(y_{\tau,1}),\cdots,G_{\tau,n_{\tau}}(y_{\tau,n_{\tau}})\right) (B.1)

where C(𝐧)ν,𝛒C_{(\boldsymbol{n})}^{{\nu},\boldsymbol{\rho}} is a tt copula with scale matrix 𝚺(𝐧)(𝛒)\boldsymbol{\Sigma}_{(\boldsymbol{n})}^{(\boldsymbol{\rho})} and degree of freedom ν\nu.

Following the similar idea in Section 3.2, we have the following result.

Corollary 2.

Consider the random vector 𝐙(𝛕)\boldsymbol{Z}_{(\boldsymbol{\tau})} whose distribution function is defined in (B.1). Then, we have the following density function of 𝐙(𝛕)\boldsymbol{Z}_{(\boldsymbol{\tau})} at 𝐙(𝛕)=𝐳(𝛕)\boldsymbol{Z}_{(\boldsymbol{\tau})}=\boldsymbol{z}_{(\boldsymbol{\tau})}

h(𝒛(𝝉))\displaystyle h(\boldsymbol{z}_{(\boldsymbol{\tau})}) (B.2)
=t=1τ[gt|R,W(yt,1,,yt,nt|r,w)\displaystyle=\int\int\prod\limits_{t=1}^{\tau}\Bigg{[}g_{t|R,W}^{*}\left(y_{t,1},\cdots,y_{t,n_{t}}|r,w\right)
(Φ(Φ1(F(nt))μtσt)Φ(Φ1(F(nt1))μtσt))]ϕ(r)νfν[Chi](wν)drdw\displaystyle\quad\quad\left(\Phi\left(\frac{\Phi^{-1}(F(n_{t}))-\mu_{t}}{\sigma_{t}}\right)-\Phi\left(\frac{\Phi^{-1}(F(n_{t}-1))-\mu_{t}}{\sigma_{t}}\right)\right)\Bigg{]}\phi(r){\nu}f_{\nu}^{\rm[Chi]}(w\,{\nu})\,{\rm d}{r}{\rm d}{w}

where fν[Chi]f_{\nu}^{\rm[Chi]} is a density function of chi-squared distribution with ν{\nu} degrees of freedom, and

μt=(θ1,ρ1,,ρ1)(𝚺(tt)(𝝆))1(r,Φ1(G(yt,1)),,Φ1(G(yt,nt)))T\mu_{t}=\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)\left(\boldsymbol{\Sigma}_{(tt)}^{(\boldsymbol{\rho}^{*})}\right)^{-1}\left(r,\Phi^{-1}\left(G(y_{t,1})\right),\cdots,\Phi^{-1}\left(G(y_{t,n_{t}})\right)\right)^{\mathrm{T}}

and

(σt)2=1(θ1,ρ1,,ρ1)(𝚺(tt)(𝝆))1(θ1,ρ1,,ρ1)Tw\left(\sigma_{t}\right)^{2}=\frac{1-\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)\left(\boldsymbol{\Sigma}_{(tt)}^{(\boldsymbol{\rho}^{*})}\right)^{-1}\left(\theta_{1},\rho_{1},\cdots,\rho_{1}\right)^{\mathrm{T}}}{w}

with ρ1=θ2\rho_{1}^{*}=\theta_{2} and ρ2=ρ2\rho_{2}^{*}=\rho_{2}. Here, gt|R,W(|r)g_{t|R,W}^{*}\left(\cdot|r\right) is the density function of 𝐘t\boldsymbol{Y}_{t} conditional on R=rR=r and W=wW=w, and given by

gt|R,W(yt,1,,yt,nt|r,w)\displaystyle g_{t|R,W}^{*}\left(y_{t,1},\cdots,y_{t,n_{t}}|r,w\right)
=ϕ𝝁,𝚺(Φ1(G(yt,1)),,Φ1(G(yt,nt)))j=1ntg(yt,j)ϕ(Φ1(G(yt,j)))\displaystyle=\phi_{\boldsymbol{\mu^{*}},\boldsymbol{\Sigma}^{*}}\left(\Phi^{-1}\left(G\left(y_{t,1}\right)\right),\cdots,\Phi^{-1}\left(G\left(y_{t,n_{t}}\right)\right)\right)\prod\limits_{j=1}^{n_{t}}\frac{g(y_{t,j})}{\phi\left(\Phi^{-1}\left(G(y_{t,j})\right)\right)}

where

𝝁=rθ2𝟏ntand𝚺=(1ρ2)𝑰nt+(ρ2θ22)𝑱nt×ntw.\boldsymbol{\mu^{*}}=r\,\theta_{2}\boldsymbol{1}_{n_{t}}\quad\hbox{and}\quad\boldsymbol{\Sigma}^{*}=\frac{(1-\rho_{2})\boldsymbol{I}_{n_{t}}+(\rho_{2}-\theta_{2}^{2})\boldsymbol{J}_{n_{t}\times n_{t}}}{w}.
Proof of Corollary 2.

Knowing that multivariate tt-distribution with the degree of freedom ν\nu can be represented as a multivariate normal distribution conditional on the latent variable W=wW=w whose density function at W=wW=w is given by νfν[Chi](wν){\nu}f_{\nu}^{\rm[Chi]}(w\,{\nu}), the proof follows immediately from Corollary 1. ∎

Note that Corollary 1 is a special case of Corollary 2 when ν=\nu=\infty.

Appendix C Mathematical Justification of Model 1

As briefly discussed in Remark 1, Model 1 is not casual in the sense that the length or dimension of the observation varies depending on the value of the observation. One solution to detour the difficulty from the varying dimension of the observation is that we may assume the infinite number of severities regardless of the value of the frequency ntn_{t}. Specifically, we define

𝒁t(𝒌t):=(N1,𝒀1(k1),,Nt,𝒀t(kt))\boldsymbol{Z}_{t}(\boldsymbol{k}_{t}):=\left(N_{1},\boldsymbol{Y}_{1}(k_{1}),\cdots,N_{t},\boldsymbol{Y}_{t}(k_{t})\right)

for any 𝒌t:=(k1,,kt)0t\boldsymbol{k}_{t}:=(k_{1},\cdots,k_{t})\in\mathbb{N}_{0}^{t}. Then, 𝒚t=𝒚t(nt)\boldsymbol{y}_{t}=\boldsymbol{y}_{t}(n_{t}) can be understood as the observation where we only observe first ntn_{t} severities among the infinite number of severities. Then, Model 1 can be reformulated as follows.

Model 4 (Revision of Model 1).

We repeatedly define the following random effect model for all possible values of

kt0,t=1,k_{t}\in\mathbb{N}_{0},\quad t=1,\cdots (C.1)

where the joint distribution between observations and the random effect model is presented with the copula model with parts i and iv are the same as in Model 1.

  1. ii.

    Conditional on R=rR=r, we have that 𝒁t(kt)\boldsymbol{Z}_{t}(k_{t}) for t=1,t=1,\cdots are independent observations whose distribution function is given by

    Ht#(𝒛t(kt)|r):=C(θ3,θ4)(Ft(nt|r),Gt(yt,1|r),,Gt(yt,kt|r)).H_{t}^{\#}\left(\boldsymbol{z}_{t}(k_{t})|r\right):=C_{(\theta_{3},\theta_{4})}\left(F_{t}(n_{t}|r),G_{t}(y_{t,1}|r),\ldots,G_{t}(y_{t,k_{t}}|r)\right).

    As a result, we have the following distribution function of 𝒁(τ)(𝒌τ)\boldsymbol{Z}_{(\tau)}(\boldsymbol{k}_{\tau})

    H#(𝒛(τ)(𝒌τ)):=t=1τHt#(𝒛t(kt)|r)π(r)dr.H^{\#}(\boldsymbol{z}_{(\tau)}(\boldsymbol{k}_{\tau})):=\int\prod\limits_{t=1}^{\tau}H_{t}^{\#}\left(\boldsymbol{z}_{t}(k_{t})|r\right)\pi(r){\rm d}r.
  2. iii.

    The parameters θ3\theta_{3} and θ4\theta_{4} of the copula C(θ3,θ4)C_{(\theta_{3},\theta_{4})} controls the independence between the frequency and severities and independence among individual severities, respectively, within a year so that we have

    ht#(𝒛t(kt)|r)=ft(nt|r)gt[joint]#(𝒚t(kt)|r) if and only if θ3=0,{h_{t}^{\#}{(\boldsymbol{z}_{t}(k_{t})|r)}}=f_{t}(n_{t}|r)g_{t}^{\rm[joint]\#}(\boldsymbol{y}_{t}(k_{t})|r)\ \text{ if and only if }\ \theta_{3}=0,

    where

    gt[joint]#(𝒚t(kt)|r)=j=1ktgt(yt,j|r) if and only if θ4=0,g_{t}^{\rm[joint]\#}{(\boldsymbol{y}_{t}(k_{t})|r)}=\prod\limits_{j=1}^{k_{t}}g_{t}{(y_{t,j}|r)}\ \text{ if and only if }\ \theta_{4}=0,

    where gt[joint]#g_{t}^{\rm[joint]\#} means joint density function of 𝒀t(kt)\boldsymbol{Y}_{t}(k_{t}).

  3. v.

    𝒚t(kt)R\boldsymbol{y}_{t}(k_{t})\perp R\, for all t=1,,τt=1,\ldots,\tau if and only if θ2=0\theta_{2}=0.

  4. vi.

    (Inheritance Property) Consider two distribution functions

    H[1](𝒛t(kt)|r):=Ht#(𝒛t(kt)|r)andH[2](𝒛t(kt)|r):=Ht#(𝒛t(kt)|r)H^{[1]}\left(\boldsymbol{z}_{t}(k_{t})|r\right):=H_{t}^{\#}\left(\boldsymbol{z}_{t}(k_{t})|r\right)\quad\hbox{and}\quad H^{[2]}\left(\boldsymbol{z}_{t}(k_{t}^{*})|r\right):=H_{t}^{\#}\left(\boldsymbol{z}_{t}(k_{t}^{*})|r\right)

    for ktktk_{t}\leq k_{t}^{*}. Then, we have the following inheritance property

    H[1](nt,yt,1,,yt,kt|r)=limyt,kt+1,,yt,ktH[2](nt,yt,1,,yt,kt,yt,kt+1,,yt,kt|r)H^{[1]}\left(n_{t},y_{t,1},\cdots,y_{t,k_{t}}|r\right)=\lim\limits_{y_{t,k_{t}+1}\rightarrow\infty,\cdots,y_{t,k_{t}^{*}}\rightarrow\infty}H^{[2]}\left(n_{t},y_{t,1},\cdots,y_{t,k_{t}},y_{t,k_{t}+1},\cdots,y_{t,k_{t}^{*}}|r\right) (C.2)

    for any 𝒛t(kt)=(nt,yt,1,,yt,kt)\boldsymbol{z}_{t}(k_{t})=\left(n_{t},y_{t,1},\cdots,y_{t,k_{t}}\right) and rr.

Note that part v in Model 4 is necessary for the well-definedness of the model since the model is repeatedly defined for multiple times for (C.1). One immediate result from Model 4 is that its density function at 𝒁t(kt)=𝒛t(kt)\boldsymbol{Z}_{t}(k_{t})=\boldsymbol{z}_{t}(k_{t})

ht#(𝒛t(kt))=ht#(nt,𝒚t(kt)|r)π(r)drh_{t}^{\#}(\boldsymbol{z}_{t}(k_{t}))=\int h_{t}^{\#}{(n_{t},\boldsymbol{y}_{t}(k_{t})|r)}\pi(r){\rm d}{r}

is well-defined under the classical multivariate analysis with the following relation with the corresponding joint distribution function

Ht#(𝒛t(kt))=x0=0ntyt,1yt,ktht#(x0,x1,,xkt)dx1dxkt.H_{t}^{\#}(\boldsymbol{z}_{t}(k_{t}))=\sum\limits_{x_{0}=0}^{n_{t}}\int_{-\infty}^{y_{t,1}}\cdots\int_{-\infty}^{y_{t,k_{t}}}h_{t}^{\#}(x_{0},x_{1},\cdots,x_{k_{t}}){\rm d}x_{1}\cdots{\rm d}x_{k_{t}}.

Furthermore, importantly, we observe that the density function ht#h_{t}^{\#} at 𝒁t(kt)=(nt,yt,1,,yt,kt)\boldsymbol{Z}_{t}(k_{t})=\left(n_{t},y_{t,1},\cdots,y_{t,k_{t}}\right) in Model 4 coincides with the density function hth_{t} in Model 1 at 𝒁t=(nt,yt,1,,yt,nt)\boldsymbol{Z}_{t}=\left(n_{t},y_{t,1},\cdots,y_{t,n_{t}}\right) if kt=ntk_{t}=n_{t}. Hence, as long as inheritance property in part vi of Model 4 holds, we can see that the density function and the corresponding distribution function in Model 1 is well-defined having Model 4 as background model. For the simplicity of the presentation, this paper only present Model 1 without specifying the background model in Model 4.


References

  • Baumgartner et al., (2015) Baumgartner, C., Gruber, L. F., and Czado, C. (2015). Bayesian total loss estimation using shared random effects. Insurance: Mathematics and Economics, 62:194–201.
  • Cossette et al., (2019) Cossette, H., Marceau, E., and Mtalai, I. (2019). Collective risk models with dependence. Insurance: Mathematics and Economics, 87:153–168.
  • Czado et al., (2012) Czado, C., Kastenmeier, R., Brechmann, E. C., and Min, A. (2012). A mixed copula model for insurance claims and claim sizes. Scandinavian Actuarial Journal, 2012(4):278–305.
  • Denuit et al., (2007) Denuit, M., Maréchal, X., Pitrebois, S., and Walhin, J.-F. (2007). Actuarial modelling of claim counts: Risk classification, credibility and bonus-malus systems. John Wiley & Sons.
  • (5) Frees, E. W., Derrig, R. A., and Meyers, G. (2014a). Predictive Modeling Applications in Actuarial Science, volume 1. Cambridge University Press.
  • (6) Frees, E. W., Gao, J., and Rosenberg, M. A. (2011a). Predicting the frequency and amount of health care expenditures. North American Actuarial Journal, 15(3):377–392.
  • Frees et al., (2016) Frees, E. W., Lee, G., and Yang, L. (2016). Multivariate frequency-severity regression models in insurance. Risks, 4(1):4.
  • (8) Frees, E. W., Meyers, G., and Cummings, A. D. (2011b). Summarizing insurance scores using a gini index. Journal of the American Statistical Association, 106(495):1085–1098.
  • (9) Frees, E. W., Meyers, G., and Cummings, A. D. (2014b). Insurance ratemaking and a gini index. Journal of Risk and Insurance, 81(2):335–366.
  • Garrido et al., (2016) Garrido, J., Genest, C., and Schulz, J. (2016). Generalized linear models for dependent frequency and severity of insurance claims. Insurance: Mathematics and Economics, 70:205 – 215.
  • Genest and Nešlehová, (2007) Genest, C. and Nešlehová, J. (2007). A primer on copulas for count data. ASTIN Bulletin: The Journal of the IAA, 37(2):475–515.
  • Genz and Bretz, (2009) Genz, A. and Bretz, F. (2009). Computation of multivariate normal and t probabilities, volume 195. Springer Science & Business Media.
  • Hernández-Bastida et al., (2009) Hernández-Bastida, A., Fernández-Sánchez, M., and Gómez-Déniz, E. (2009). The net bayes premium with dependence between the risk profiles. Insurance: Mathematics and Economics, 45(2):247–254.
  • Jeong and Valdez, (2020) Jeong, H. and Valdez, E. A. (2020). Predictive compound risk models with dependence. SSRN, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3494575.
  • Kadhem and Nikoloulopoulos, (2019) Kadhem, S. H. and Nikoloulopoulos, A. K. (2019). Factor copula models for mixed data. arXiv preprint arXiv:1907.07395.
  • Klugman et al., (2012) Klugman, S. A., Panjer, H. H., and Willmot, G. E. (2012). Loss models: from data to decisions, volume 715. John Wiley & Sons.
  • Krämer et al., (2013) Krämer, N., Brechmann, E. C., Silvestrini, D., and Czado, C. (2013). Total loss estimation using copula-based regression models. Insurance: Mathematics and Economics, 53(3):829–839.
  • Krupskii and Joe, (2013) Krupskii, P. and Joe, H. (2013). Factor copula models for multivariate data. Journal of Multivariate Analysis, 120:85–101.
  • Krupskii and Joe, (2015) Krupskii, P. and Joe, H. (2015). Structured factor copula models: Theory, inference and computation. Journal of Multivariate Analysis, 138:53–73.
  • Landsman and Valdez, (2003) Landsman, Z. M. and Valdez, E. A. (2003). Tail conditional expectations for elliptical distributions. North American Actuarial Journal, 7(4):55–71.
  • Lee and Shi, (2019) Lee, G. Y. and Shi, P. (2019). A dependent frequency–severity approach to modeling longitudinal insurance claims. Insurance: Mathematics and Economics, 87:115–129.
  • Lee et al., (2020) Lee, W., Kim, J., and Ahn, J. Y. (2020). The poisson random effect model for experience ratemaking: Limitations and alternative solutions. Insurance: Mathematics and Economics, 91:26–36.
  • Murray et al., (2013) Murray, J. S., Dunson, D. B., Carin, L., and Lucas, J. E. (2013). Bayesian gaussian copula factor models for mixed data. Journal of the American Statistical Association, 108(502):656–665.
  • Nikoloulopoulos and Joe, (2015) Nikoloulopoulos, A. K. and Joe, H. (2015). Factor copula models for item response data. Psychometrika, 80(1):126–150.
  • (25) Oh, R., Ahn, J. Y., and Lee, W. (2020a). On copula-based collective risk models: from elliptical copulas to vine copulas. Scandinavian Actuarial Journal, In Press.
  • (26) Oh, R., Shi, P., and Ahn, J. Y. (2020b). Bonus-malus premiums under the dependent frequency-severity modeling. Scandinavian Actuarial Journal, 2020(3):172–195.
  • Park et al., (2018) Park, S. C., Kim, J. H., and Ahn, J. Y. (2018). Does hunger for bonuses drive the dependence between claim frequency and severity? Insurance: Mathematics and Economics, 83:32–46.
  • Shi et al., (2015) Shi, P., Feng, X., and Ivantsova, A. (2015). Dependent frequency–severity modeling of insurance claims. Insurance: Mathematics and Economics, 64:417–428.
  • Shi and Yang, (2018) Shi, P. and Yang, L. (2018). Pair copula constructions for insurance experience rating. Journal of the American Statistical Association, 113(521):122–133.
  • Yang et al., (2019) Yang, L., Frees, E. W., and Zhang, Z. (2019). Nonparametric estimation of copula regression models with discrete outcomes. Journal of the American Statistical Association, pages 1–25.