In the model development, the “leave-one-out” prediction is a way of cross-validation, calculated as below:
1. First of all, after a model is developed, each observation used in the model development is removed in turn and then the model is refitted with the remaining observations
2. The out-of-sample prediction for the refitted model is calculated with the removed observation one by one to assemble the LOO, e.g. leave-one-out predicted values for the whole model development sample.
The loo_predict() function below is a general routine to calculate the LOO prediction for any GLM object, which can be further employed to investigate the model stability and predictability.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
|
>
pkgs <- c ( 'doParallel' ,
'foreach' ) >
lapply (pkgs,
require, character.only = T) [[1]] [1]
TRUE [[2]] [1]
TRUE >
registerDoParallel (cores
= 8) > >
data (AutoCollision,
package = "insuranceData" ) >
#
A GAMMA GLM # >
model1 <- glm (Severity
~ Age + Vehicle_Use, data = AutoCollision, family = Gamma (link
= "log" )) >
#
A POISSON GLM # >
model2 <- glm (Claim_Count
~ Age + Vehicle_Use, data = AutoCollision, family = poisson (link
= "log" )) > >
loo_predict <- function (obj)
{ +
yhat <- foreach (i
= 1: nrow (obj$data),
.combine = rbind) %dopar% { +
predict ( update (obj,
data = obj$data[-i, ]), obj$data[i,], type = "response" ) +
} +
return ( data.frame (result
= yhat[, 1], row.names = NULL )) +
} >
#
TEST CASE 1 >
test1 <- loo_predict (model1) >
test1$result [1]
303.7393 328.7292 422.6610 375.5023 240.9785 227.6365 288.4404 446.5589 [9]
213.9368 244.7808 278.7786 443.2256 213.9262 243.2495 266.9166 409.2565 [17]
175.0334 172.0683 197.2911 326.5685 187.2529 215.9931 249.9765 349.3873 [25]
190.1174 218.6321 243.7073 359.9631 192.3655 215.5986 233.1570 348.2781 >
#
TEST CASE 2 >
test2 <- loo_predict (model2) >
test2$result [1]
11.15897 37.67273 28.76127 11.54825 50.26364 152.35489 122.23782 [8]
44.57048 129.58158 465.84173 260.48114 107.23832 167.40672 510.41127 [15]
316.50765 121.75804 172.56928 546.25390 341.03826 134.04303 359.30141 [22]
977.29107 641.69934 251.32547 248.79229 684.86851 574.13994 238.42209 [29]
148.77733 504.12221 422.75047 167.61203 |