# Mack's model is a GLM!

Which actuary does not know Mack's model? Introduced by Mack (1991), this model is fairly simple. Suppose you have a triangle of cumulative claims.

OK, judging by the origin dates of the claims, these data are old. But who cares? Let's denote by $$C_{i,j}$$ the value of this triangle in the $$i$$'th row and the $$j$$'th column.

Mack's model is a stochastic model that frames the classical chain-ladder estimator, denoted here by the vector $$\mathbf{\hat{f}}$$ and defined $$\forall j \in \{1,...,n-1\}$$ by:

$\hat{f}_j = \frac{\sum\limits_{i=1}^{n-j} C_{i,j+1}}{\sum\limits_{i=1}^{n-j} C_{i,j}}$

Then we estimate future claims by setting, $$\forall i,j \,|\, i+j>n$$, $$\hat{C}_{i,j+1} = \hat{f}_j \hat{C}_{i,j}$$, starting from the last observed diagonal. That's chain-ladder estimation, right?
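The recursion above fits in a few lines. A minimal sketch (in Python/numpy rather than R, on a made-up 3×3 triangle, not the ABC data used below):

```python
import numpy as np

# Hypothetical 3x3 cumulative triangle: rows are origin years i,
# columns are development years j; NaN marks the unknown future cells.
C = np.array([
    [100.0, 150.0, 165.0],
    [110.0, 160.0, np.nan],
    [120.0, np.nan, np.nan],
])
n = C.shape[0]

# Chain-ladder factors: ratio of column sums over the rows where
# both C[i, j] and C[i, j+1] are observed (i = 1, ..., n-j).
f_hat = [C[: n - j - 1, j + 1].sum() / C[: n - j - 1, j].sum()
         for j in range(n - 1)]

# Fill the lower triangle column by column from the last diagonal.
for j in range(n - 1):
    missing = np.isnan(C[:, j + 1])
    C[missing, j + 1] = f_hat[j] * C[missing, j]
```

After the loop, every cell below the diagonal is the product of the diagonal value and the estimated factors to its left, which is exactly the chain-ladder prediction.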

Mack then found stochastic hypotheses that justify these estimators in a precise sense (the MSE sense):

• $$(H1)$$ : The rows (origin years) are independent
• $$(H2)$$ : $$\exists\, \mathbf{f} \text{ such that } \forall i,j,\, \mathbb{E}(C_{i,j+1}\,|\, C_{i,1},...,C_{i,j}) = f_j C_{i,j}$$
• $$(H3)$$ : $$\exists\, \mathbf{\sigma} \text{ such that } \forall i,j,\, \mathbb{V}(C_{i,j+1}\,|\, C_{i,1},...,C_{i,j}) = \sigma_j^2 C_{i,j}$$

Let's look a little closer at these hypotheses. First, we should note the conditioning of $$C_{i,j+1}$$ on the past of the origin year.

Suppose that, following England and Verrall (2002), we reformulate the model in terms of individual development factors. Set:

$F_{i,j} = \frac{C_{i,j+1}}{C_{i,j}} \text{ and } w_{i,j} = C_{i,j}$

The last two hypotheses can then be reformulated in the following way (just dividing by $$w_{i,j}$$):

• $$(H2*)$$ : $$\exists\, \mathbf{f} \text{ such that } \forall i,j,\, \mathbb{E}(F_{i,j}\,|\, w_{i,j}) = f_j$$
• $$(H3*)$$ : $$\exists\, \mathbf{\sigma} \text{ such that } \forall i,j,\, \mathbb{V}(F_{i,j}\,|\, w_{i,j}) = \frac{\sigma_j^2}{w_{i,j}}$$
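Spelled out, the reformulation really is just a division: since $$C_{i,j}$$ is known given the conditioning, dividing $$(H2)$$ by $$C_{i,j}$$ and $$(H3)$$ by $$C_{i,j}^2$$ gives

$\mathbb{E}(F_{i,j}\,|\,C_{i,1},...,C_{i,j}) = \frac{\mathbb{E}(C_{i,j+1}\,|\,C_{i,1},...,C_{i,j})}{C_{i,j}} = \frac{f_j\, C_{i,j}}{C_{i,j}} = f_j \quad \text{and} \quad \mathbb{V}(F_{i,j}\,|\,C_{i,1},...,C_{i,j}) = \frac{\sigma_j^2\, C_{i,j}}{C_{i,j}^2} = \frac{\sigma_j^2}{w_{i,j}}$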

Note that I assume the information given by the past is well summarised by $$w_{i,j}$$, which is indeed the case under these hypotheses.

Now that we have reformulated the hypotheses, you'll say: "What about the GLM you promised?" Well, I'm coming to it.

Look again at the formula for the estimator of the chain-ladder development factors. Do you see the weighted mean of $$F$$'s, weighted by $$C$$'s? Well, you should! Indeed, since we set $$w_{i,j} = C_{i,j}$$,

$\hat{f}_j = \frac{\sum\limits_{i=1}^{n-j} F_{i,j}\, w_{i,j}}{\sum\limits_{i=1}^{n-j} w_{i,j}}$
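The identity holds because $$w_{i,j} F_{i,j} = C_{i,j} \cdot C_{i,j+1}/C_{i,j} = C_{i,j+1}$$. A quick numerical sketch (made-up values for one development column, not the ABC triangle):

```python
import numpy as np

# Made-up values for one development column (not the ABC triangle):
C_j  = np.array([100.0, 110.0, 120.0, 130.0])   # column j
C_j1 = np.array([150.0, 160.0, 170.0, 185.0])   # column j+1, same rows

w = C_j              # case weights
F = C_j1 / C_j       # individual development factors

ratio_of_sums = C_j1.sum() / C_j.sum()
weighted_mean = np.average(F, weights=w)   # sum(w*F) / sum(w)

# The two coincide because w * F = C_{i,j} * C_{i,j+1}/C_{i,j} = C_{i,j+1}.
assert np.isclose(ratio_of_sums, weighted_mean)
```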

Now let's focus on the Mack estimators for the variance parameters $$\mathbf{\sigma}$$:

$\hat{\sigma}_j^2 = \frac{1}{n-j} \sum\limits_{i=1}^{n-j} w_{i,j} (F_{i,j} - \hat{f}_j)^2$

This looks a lot like a simple unbiased variance estimate of the $$F$$'s around $$\hat{f}_j$$, with case weights $$w$$, right? Well, up to the normalisation (which should be $$\frac{1}{n-j-1}$$ for unbiasedness), it is.
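As a numerical sketch of that normalisation remark (made-up values for one development column with $$m$$ observed rows, not the ABC data): the plain mean of the squared weighted residuals only needs a $$\frac{m}{m-1}$$ degrees-of-freedom correction to match the unbiased weighted variance.

```python
import numpy as np

# Made-up values for one development column with m = 4 observed rows:
C_j  = np.array([100.0, 110.0, 120.0, 130.0])
C_j1 = np.array([150.0, 160.0, 170.0, 185.0])
w, F = C_j, C_j1 / C_j
m = len(F)

f_hat = np.average(F, weights=w)   # chain-ladder factor for this column

# Unbiased weighted variance around f_hat (one parameter estimated):
sigma2_unbiased = np.sum(w * (F - f_hat) ** 2) / (m - 1)

# Plain mean of squared weighted residuals (what a naive fit reports);
# the degrees-of-freedom correction m/(m-1) recovers the unbiased one:
sigma2_mean = np.mean(w * (F - f_hat) ** 2)
assert np.isclose(sigma2_mean * m / (m - 1), sigma2_unbiased)
```

This is exactly the correction applied to the fitted dispersions later in the post.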

OK, now that I have given you some hints, let's get to it. To estimate the first order, i.e. the $$f$$'s, we just take a weighted average of the observed $$F$$'s, with weights $$w$$, by column.

So if I set up a GLM with:

• any distribution in the exponential family,
• case weights given by $$w$$,
• response variable $$F$$,
• and, as only covariate, the indicator of the column, i.e. the development year,

I should get the same estimates, right? Let's try it on the ABC triangle from the ChainLadder package (see Gesmann (2009)):

Table 1: Incurred triangle – cumulative view

| origin | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1977 | 153638 | 342050 | 476584 | 564040 | 624388 | 666792 | 698030 | 719282 | 735904 | 750344 | 762544 |
| 1978 | 178536 | 404948 | 563842 | 668528 | 739976 | 787966 | 823542 | 848360 | 871022 | 889022 | |
| 1979 | 210172 | 469340 | 657728 | 780802 | 864182 | 920268 | 958764 | 992532 | 1019932 | | |
| 1980 | 211448 | 464930 | 648300 | 779340 | 858334 | 918566 | 964134 | 1002134 | | | |
| 1981 | 219810 | 486114 | 680764 | 800862 | 888444 | 951194 | 1002194 | | | | |
| 1982 | 205654 | 458400 | 635906 | 765428 | 862214 | 944614 | | | | | |
| 1983 | 197716 | 453124 | 647772 | 790100 | 895700 | | | | | | |
| 1984 | 239784 | 569026 | 833828 | 1024228 | | | | | | | |
| 1985 | 326304 | 798048 | 1173448 | | | | | | | | |
| 1986 | 420778 | 1011178 | | | | | | | | | |
| 1987 | 496200 | | | | | | | | | | |

First, the estimated development factors given by standard chain-ladder:

```r
ChainLadder::MackChainLadder(ABC)$f
## [1] 2.308599 1.421098 1.199934 1.113445 1.072736 1.047559 1.034211 1.026047
## [9] 1.020188 1.016259 1.000000
```

If we create a data.frame with what we need and then fit a rather trivial glm on it (actually, it's even a lm here):

```r
df <- ABC %>%
  as.data.frame %>%
  mutate(
    weight = cbind(rep(NA, nrow(ABC)), ABC %>% as.matrix) %>%
      .[, 1:nrow(ABC)] %>%
      as.vector,
    F = value / weight
  ) %>%
  drop_na

model <- lm(data = df, formula = F ~ factor(dev) + 0, weights = weight)

df %<>% mutate(
  f.hat = model$fitted.values,
  rez2  = weight * (F - f.hat)^2
)
```

I also extracted the squared Pearson residuals.

Indeed, if we want to model the $$\sigma$$'s too, we need a second model to compute the weighted average of... squared Pearson residuals! Which is standard in the context of joint modelling of GLMs; see e.g. McCullagh and Nelder (1989).

To compute the weighted average of squared residuals (for the dispersion parameters), we'll just use another linear model, since only the first order is of interest here. Note that we exclude the last column of the triangle, since it contains only one point (and therefore no variance estimation is possible). We also add the last value estimated by Mack for the sake of completeness.

```r
model2 <- df %>%
  filter(dev != nrow(ABC)) %>%
  {lm(data = ., formula = rez2 ~ factor(dev))}
```

```r
df %<>% mutate(
  estimated.phi = model2$fitted.values %>%
    {c(., min(.[length(.)], .[length(.)-1], .[length(.)]^2/.[length(.)-1]))}
)
df
## origin dev value weight F f.hat rez2 estimated.phi
## 1 1977 2 342050 153638 2.226337 2.308599 1039.6609948 1940.0408948
## 2 1978 2 404948 178536 2.268159 2.308599 291.9754480 1940.0408948
## 3 1979 2 469340 210172 2.233123 2.308599 1197.2515446 1940.0408948
## 4 1980 2 464930 211448 2.198791 2.308599 2549.5751391 1940.0408948
## 5 1981 2 486114 219810 2.211519 2.308599 2071.5915573 1940.0408948
## 6 1982 2 458400 205654 2.228987 2.308599 1303.4551427 1940.0408948
## 7 1983 2 453124 197716 2.291792 2.308599 55.8462412 1940.0408948
## 8 1984 2 569026 239784 2.373077 2.308599 996.9031237 1940.0408948
## 9 1985 2 798048 326304 2.445719 2.308599 6135.1878014 1940.0408948
## 10 1986 2 1011178 420778 2.403115 2.308599 3758.9619552 1940.0408948
## 11 1977 3 476584 342050 1.393317 1.421098 263.9876431 548.0174476
## 12 1978 3 563842 404948 1.392381 1.421098 333.9344808 548.0174476
## 13 1979 3 657728 469340 1.401389 1.421098 182.3038561 548.0174476
## 14 1980 3 648300 464930 1.394403 1.421098 331.3012598 548.0174476
## 15 1981 3 680764 486114 1.400420 1.421098 207.8370561 548.0174476
## 16 1982 3 635906 458400 1.387229 1.421098 525.8105220 548.0174476
## 17 1983 3 647772 453124 1.429569 1.421098 32.5170215 548.0174476
## 18 1984 3 833828 569026 1.465360 1.421098 1114.8127986 548.0174476
## 19 1985 3 1173448 798048 1.470398 1.421098 1939.6523906 548.0174476
## 20 1977 4 564040 476584 1.183506 1.199934 128.6170742 208.3223889
## 21 1978 4 668528 563842 1.185665 1.199934 114.7889365 208.3223889
## 22 1979 4 780802 657728 1.187120 1.199934 107.9955483 208.3223889
## 23 1980 4 779340 648300 1.202129 1.199934 3.1232034 208.3223889
## 24 1981 4 800862 680764 1.176416 1.199934 376.5043910 208.3223889
## 25 1982 4 765428 635906 1.203681 1.199934 8.9295071 208.3223889
## 26 1983 4 790100 647772 1.219719 1.199934 253.5814014 208.3223889
## 27 1984 4 1024228 833828 1.228344 1.199934 673.0390490 208.3223889
## 28 1977 5 624388 564040 1.106992 1.113445 23.4819775 95.1739102
## 29 1978 5 739976 668528 1.106874 1.113445 28.8663512 95.1739102
## 30 1979 5 864182 780802 1.106788 1.113445 34.6021980 95.1739102
## 31 1980 5 858334 779340 1.101360 1.113445 113.8120512 95.1739102
## 32 1981 5 888444 800862 1.109360 1.113445 13.3642605 95.1739102
## 33 1982 5 862214 765428 1.126447 1.113445 129.4015811 95.1739102
## 34 1983 5 895700 790100 1.133654 1.113445 322.6889519 95.1739102
## 35 1977 6 666792 624388 1.067913 1.072736 14.5232459 95.4346025
## 36 1978 6 787966 739976 1.064853 1.072736 45.9752469 95.4346025
## 37 1979 6 920268 864182 1.064901 1.072736 53.0508012 95.4346025
## 38 1980 6 918566 858334 1.070173 1.072736 5.6366537 95.4346025
## 39 1981 6 951194 888444 1.070629 1.072736 3.9429272 95.4346025
## 40 1982 6 944614 862214 1.095568 1.072736 449.4787401 95.4346025
## 41 1977 7 698030 666792 1.046848 1.047559 0.3369445 14.7731099
## 42 1978 7 823542 787966 1.045149 1.047559 4.5761913 14.7731099
## 43 1979 7 958764 920268 1.041831 1.047559 30.1914463 14.7731099
## 44 1980 7 964134 918566 1.049608 1.047559 3.8554008 14.7731099
## 45 1981 7 1002194 951194 1.053617 1.047559 34.9055665 14.7731099
## 46 1977 8 719282 698030 1.030446 1.034211 9.8952347 12.6617691
## 47 1978 8 848360 823542 1.030136 1.034211 13.6760755 12.6617691
## 48 1979 8 992532 958764 1.035220 1.034211 0.9771982 12.6617691
## 49 1980 8 1002134 964134 1.039414 1.034211 26.0985679 12.6617691
## 50 1977 9 735904 719282 1.023109 1.026047 6.2066637 2.9989596
## 51 1978 9 871022 848360 1.026713 1.026047 0.3763507 2.9989596
## 52 1979 9 1019932 992532 1.027606 1.026047 2.4138644 2.9989596
## 53 1977 10 750344 735904 1.019622 1.020188 0.2353251 0.2170726
## 54 1978 10 889022 871022 1.020665 1.020188 0.1988201 0.2170726
## 55 1977 11 762544 750344 1.016259 1.016259 0.0000000 0.2170726
```

And there we have it! Those estimates are clearly the same as Mack's.
Indeed:

```r
model4 <- MackChainLadder(ABC, est.sigma = "Mack")
resultat <- df %>%
  group_by(dev) %>%
  summarise(phi = mean(estimated.phi)) %>%
  mutate(
    mack = model4$sigma^2,
    degree.of.freedom = max(dev) - dev, # correction for degrees of freedom
    phi = (phi * (degree.of.freedom + 1) / degree.of.freedom) %>%
      .[1:(length(.)-1)] %>%
      {c(., min(.[length(.)], .[length(.)-1], .[length(.)]^2/.[length(.)-1]))},
    f = model$coefficients[1:(dim(ABC)[1]-1)],
    f.cl = model4$f[1:(dim(ABC)[1]-1)]
  ) %>%
  select(f, f.cl, phi, mack) %>%
  as.data.frame
```
| f | f.cl | phi | mack |
|---|---|---|---|
| 2.308599 | 2.308599 | 2155.6009942 | 2155.6009942 |
| 1.421098 | 1.421098 | 616.5196286 | 616.5196286 |
| 1.199934 | 1.199934 | 238.0827301 | 238.0827301 |
| 1.113445 | 1.113445 | 111.0362286 | 111.0362286 |
| 1.072736 | 1.072736 | 114.5215230 | 114.5215230 |
| 1.047559 | 1.047559 | 18.4663874 | 18.4663874 |
| 1.034211 | 1.034211 | 16.8823588 | 16.8823588 |
| 1.026047 | 1.026047 | 4.4984394 | 4.4984394 |
| 1.020188 | 1.020188 | 0.4341453 | 0.4341453 |
| 1.016259 | 1.016259 | 0.0418994 | 0.0418994 |

So, OK, we reproduced Mack's results in a new model. But under which hypotheses? I see linear models there, which generally require Gaussian errors, so some distributional hypothesis, right?

The hypothesis we used are :

• $$(H1**)$$ : The rows are independent
• $$(H2**)$$ : $$\exists\, \mathbf{f}\text{ and }\mathbf{\sigma} \text{ such that } \forall i,j,\, F_{i,j} \,|\, w_{i,j} \sim \mathcal{N}(f_j, \frac{\sigma_j^2}{w_{i,j}})$$

Well, if we forget about the distributional assumption and only use a quasi-GLM (see, once again, McCullagh and Nelder (1989)), these are exactly Mack's hypotheses.

Therefore, Mack's model is a quasi-GLM: conditional, weighted, with a normal-like variance assumption.
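For the skeptical reader, that weighted Gaussian fit can be sketched without any GLM machinery: with one dummy column per development year and case weights $$w$$, the weighted least-squares solution (which is the Gaussian GLM with identity link) is exactly the per-column $$w$$-weighted mean, i.e. the chain-ladder factor. A minimal numpy sketch on made-up data (two development years, hypothetical values):

```python
import numpy as np

# Hypothetical data: two development years with 3 and 2 observations.
dev = np.array([1, 1, 1, 2, 2])                  # development-year index
F   = np.array([1.50, 1.45, 1.42, 1.20, 1.25])   # individual factors (response)
w   = np.array([100., 110., 120., 150., 160.])   # case weights C_{i,j}

# One dummy column per development year, no intercept:
levels = np.unique(dev)
X = (dev[:, None] == levels[None, :]).astype(float)

# Weighted least squares = Gaussian GLM with identity link and
# variance proportional to 1/w: solve (X' W X) beta = X' W F.
W = np.diag(w)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ F)

# Each coefficient is the column-wise w-weighted mean of F,
# i.e. the chain-ladder development factor for that column.
for k, d in enumerate(levels):
    assert np.isclose(beta[k], np.average(F[dev == d], weights=w[dev == d]))
```

This mirrors what `lm(F ~ factor(dev) + 0, weights = weight)` does in the R code above.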

Some remarks :

• Note that we played a little with degrees of freedom: since we estimate the $$\sigma$$'s after the $$f$$'s, we should take into account that $$n-1$$ parameters have already been estimated. This is classical in the joint modelling process.
• Furthermore, the joint modelling process offers a choice of residuals (Pearson residuals or deviance contributions), but in the normal case they coincide.
• Last but not least, accounting for the kurtosis is not necessary here because of the normal variance assumption.

If you want to try this on your own triangle, feel free to download the gist I made with a check_model function.

So what? OK, Mack's model is a GLM, and then what? Well, recognising this classical hypothesis set as a GLM allows several things:

• The use of standard statistical tools: Cook's distance, residual analysis, etc.
• The use of standard software: R's glm function is probably safer than ChainLadder's specific functions.
• The simplicity of extension: the Bornhuetter-Ferguson extension, for example, can easily be introduced within a GLM framework; it's harder within Mack's framework.

Next time, I'll continue my analysis of Mack's model by talking about one-year risk and the Merz-Wüthrich principles in this quasi-GLM framework.

References :

England, P. D., and R. J. Verrall. 2002. “Stochastic Claims Reserving in General Insurance.” British Actuarial Journal 8 (03): 443–518. https://doi.org/10/csqxh7.

Gesmann, Markus. 2009. "Claims Reserving in R: The ChainLadder Package." London R-User Group Meeting, March.

Mack, Thomas. 1991. “A Simple Parametric Model for Rating Automobile Insurance or Estimating IBNR Claims Reserves.” ASTIN Bulletin 21 (01): 93–109. https://doi.org/10.2143/AST.21.1.2005403.

McCullagh, P., and J. A. Nelder. 1989. *Generalized Linear Models*.

##### Oskar Laverny
###### Maître de Conférences
