Use of SynthETIC to Generate Individual Claims of Realistic Features

This vignette aims to illustrate how the SynthETIC package can be used to generate a general insurance claims history with realistic distributional assumptions consistent with the experience of a specific (but anonymous) Auto Liability portfolio. The simulator is composed of 8 modelling steps (or modules), each of which will build on (a selection of) the output from previous steps:

  1. Claim occurrence: claim frequency, claim occurrence times
  2. Claim size: claim size in constant dollar values i.e. without inflation
  3. Claim notification: notification delay (delay from occurrence to notification)
  4. Claim closure: settlement delay (delay from notification to closure)
  5. Claim payment count: number of partial payments
  6. Claim payment size: sizes of partial payments in constant dollar values i.e. without inflation
  7. Claim payment time: inter-partial-payment delays, partial payment times in calendar period
  8. Claim inflation: sizes of inflated partial payments

In particular, with this demo we will output

Description | R Object
N, claim frequency | n_vector = # claims for each accident period
U, claim occurrence time | occurrence_times[[i]] = claim occurrence time for all claims that occurred in period i
S, claim size | claim_sizes[[i]] = claim size for all claims that occurred in period i
V, notification delay | notidel[[i]] = notification delay for all claims that occurred in period i
W, settlement delay | setldel[[i]] = settlement delay for all claims that occurred in period i
M, number of partial payments | no_payments[[i]] = number of partial payments for all claims that occurred in period i
size of partial payments | payment_sizes[[i]][[j]] = $ partial payments for claim j of occurrence period i
inter-partial delays | payment_delays[[i]][[j]] = inter partial delays for claim j of occurrence period i
payment times (continuous time) | payment_times[[i]][[j]] = payment times (in continuous time) for claim j of occurrence period i
payment times (period) | payment_periods[[i]][[j]] = payment times (in calendar periods) for claim j of occurrence period i
actual payments (inflated) | payment_inflated[[i]][[j]] = $ partial payments (inflated) for claim j of occurrence period i
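
The nested indexing used above can be illustrated with a toy example in base R. The data below are made up for illustration and are not SynthETIC output; only the `[[i]][[j]]` layout mirrors the objects listed in the table.

```r
# Toy data (hypothetical): 2 occurrence periods with 2 claims and 1 claim respectively
payment_sizes <- list(
  list(c(100, 250), c(40, 60, 300)),  # period 1: claims 1 and 2
  list(c(500))                        # period 2: claim 1
)

# Partial payments of claim j = 2 in occurrence period i = 1
payment_sizes[[1]][[2]]

# Number of claims per occurrence period (plays the role of n_vector)
claims_per_period <- lengths(payment_sizes)

# Total paid per claim in period 1 (plays the role of claim_sizes[[1]])
paid_period_1 <- sapply(payment_sizes[[1]], sum)
```

Objects indexed by `[[i]]` only hold one value per claim, whereas objects indexed by `[[i]][[j]]` hold one vector per claim with one entry per partial payment.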

Reference

For a full description of SynthETIC’s structure and test parameters, readers should refer to:

Avanzi B, Taylor G, Wang M, Wong B (2020). “SynthETIC: an individual insurance claim simulator with feature control”. arXiv:2008.05693.

Set Up

library(SynthETIC)
set.seed(20200131)

Package-wise Global Parameters

We introduce the reference value ref_claim partly as a measure of the monetary unit and/or overall claims experience. The default distributional assumptions were set up with a specific (but anonymous) Auto Liability portfolio in mind. ref_claim then allows users to easily simulate a synthetic portfolio with a similar claim pattern but in a different currency, for example. Users can alternatively interpret ref_claim as a monetary unit: for example, one can set ref_claim <- 1000 and think of all amounts in units of $1,000. In that case, however, the default functions (as listed below) will not work, and users will need to supply their own set of functions with values set as multiples of ref_claim rather than fractions as in the default setting.

We also require the user to input a time_unit (which should be given as a fraction of a year), so that the default input parameters also apply to contexts where the time units are not quarters. In the default setting we have a time_unit of 1/4.

The default input parameters will update automatically with the choice of the two global variables ref_claim and time_unit, which ensures that the simulator produces sensible results in contexts other than the default setting. We remark that both ref_claim and time_unit only affect the default simulation functions, and users can also choose to set up their own modelling assumptions for any of the modules to match their experience even better. In the latter case, it is the responsibility of the user to ensure that their input parameters are compatible with their time units and claims experience. For example, if the time units are quarters, then claim occurrence rates must be quarterly.

set_parameters(ref_claim = 200000, time_unit = 1/4)
ref_claim <- return_parameters()[1]
time_unit <- return_parameters()[2]
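
To make the compatibility requirement concrete, here is a minimal base-R sketch of rescaling quantities between time units. Both helper functions are hypothetical (they are not part of SynthETIC); they only assume that time_unit is expressed as a fraction of a year, as described above.

```r
# Hypothetical helper: convert a delay expressed in quarters into the chosen time unit
quarters_to_time_unit <- function(x_in_quarters, time_unit) {
  x_in_quarters * (1 / 4) / time_unit
}

# Hypothetical helper: convert an annual occurrence rate into a per-period rate
annual_rate_to_period_rate <- function(annual_rate, time_unit) {
  annual_rate * time_unit
}

quarters_to_time_unit(2, time_unit = 1 / 4)   # 2 quarters stay 2 quarters
quarters_to_time_unit(2, time_unit = 1 / 12)  # 2 quarters expressed in months
quarters_to_time_unit(2, time_unit = 1)       # 2 quarters expressed in years

annual_rate_to_period_rate(0.03, time_unit = 1 / 4)  # quarterly rate from an annual rate
```

The same `(1 / 4) / time_unit` factor appears in the default simulate_d function later in this vignette, where a delay of one quarter is rescaled to the chosen time unit.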

The reference value ref_claim will be used throughout the simulation process, as listed in the table below.

Module | Details
2. Claim Size | At ref_claim = 200000, by default we simulate claim sizes from S^0.2 ~ Normal (9.5, sd = 3), left truncated at 30. When the reference value changes, we output the claim sizes scaled by a factor of ref_claim / 200000.
3. Claim Notification | By default we set the mean notification delay (in quarters) to be \[\min(3, \max(1, 2 - \frac{1}{3} \log(\frac{claim\_size}{0.5~ref\_claim})))\] (which will be automatically converted to the relevant time_unit), i.e. the mean notification delay decreases logarithmically with claim size. It has minimum value 1 and maximum value 3, and equals 2 for a claim of size exactly 0.5*ref_claim.
4. Claim Closure | The default value for the mean settlement delay involves a term that defines the benchmark for a claim to be considered “small”: 0.1*ref_claim. The default mean settlement delay increases logarithmically with claim size and equals 6 exactly at this benchmark. Furthermore, there was a legislative change, captured in the default mean function, that impacted the settlement delays of those “small” claims.
5. Claim Payment Count | We need two claim size benchmarks as we sample from different distributions for claims of different sizes. In general a small number of partial payments is required to settle small claims, and additional payments will be required to settle more extreme claims. It is assumed that claims below 0.0375*ref_claim can be settled in 1 or 2 payments, claims between 0.0375*ref_claim and 0.075*ref_claim in 2 or 3 payments, and claims beyond 0.075*ref_claim in no less than 4 payments.
6. Claim Payment Size | We use the same proportion of ref_claim as in the Claim Closure module, namely 0.1*ref_claim. This benchmark value is used when simulating the proportion of the last two payments in the default simulate_amt_pmt function. The mean proportion of claim paid in the last two payments increases logarithmically with claim size, and equals 75% exactly at this benchmark.
8. Claim Inflation | Two benchmark values are required in this section, one each for the default SI occurrence and SI payment functions. 1) A legislative change, captured by SI occurrence, reduced claim size by up to 40% for the smallest claims and impacted claims up to 0.25*ref_claim in size. 2) The default SI payment is set to be 30% p.a. for the smallest claims and zero for claims exceeding ref_claim in size, and varies linearly for claims between 0 and ref_claim.
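
As an illustration of how ref_claim enters these defaults, the mean notification delay formula quoted for module 3 can be written as a small base-R function. The function name is hypothetical and the result is in quarters; this is a transcription of the stated formula, not the package's internal code.

```r
# Hypothetical helper: default mean notification delay (in quarters) as described above
mean_notidel_quarters <- function(claim_size, ref_claim) {
  pmin(3, pmax(1, 2 - (1 / 3) * log(claim_size / (0.5 * ref_claim))))
}

ref_claim <- 200000
# A very small claim, a claim at the benchmark 0.5 * ref_claim, and a very large claim
mean_notidel_quarters(c(1000, 100000, 5e6), ref_claim)
# 3 2 1
```

The small claim hits the upper cap of 3, the benchmark claim gives exactly 2, and the large claim hits the lower bound of 1.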

The time_unit chosen will impact the time-related modules, specifically

1. Claim Occurrence

Input parameters

Implementation and Output

# Number of claims occurring for each period i
# shorter equivalent code:
# n_vector <- claim_frequency()
n_vector <- claim_frequency(I, E, lambda)
n_vector
#>  [1]  90  79 102  78  86  88 116  84  93 104  80  87  86 104  81  84 101  96  96
#> [20]  86 102 103  82  83  80  80  82  87 103  79  79 100  94  99  88 101  91  95
#> [39]  91  84

# Occurrence time of each claim r, for each period i
occurrence_times <- claim_occurrence(n_vector)
occurrence_times[[1]]
#>  [1] 0.6238351404 0.1206679437 0.2220435985 0.4538308736 0.5910992266
#>  [6] 0.9524491858 0.3660710892 0.1923275446 0.5391526092 0.7398599708
#> [11] 0.9761979643 0.6794459166 0.6491731463 0.0145699105 0.0117662018
#> [16] 0.0002802343 0.1229670814 0.2181776366 0.9188914341 0.3641183279
#> [21] 0.3599445471 0.3228054109 0.7384824581 0.0756409415 0.2406489884
#> [26] 0.0309497463 0.1994408462 0.0391640882 0.1830444403 0.5194172878
#> [31] 0.8934622605 0.2604308173 0.8512500757 0.1738214253 0.4129021554
#> [36] 0.0683904318 0.0944415457 0.5636684340 0.4130775523 0.6496588932
#> [41] 0.2293977202 0.2929870863 0.1346096094 0.3428012058 0.5930486526
#> [46] 0.7660660581 0.7112241383 0.9488298327 0.0046397008 0.7370544358
#> [51] 0.1497760331 0.0386742705 0.1717934967 0.8123882010 0.3574451937
#> [56] 0.7511094357 0.2453237963 0.8360645119 0.7225212962 0.5654766215
#> [61] 0.0858555159 0.2943205256 0.4229451967 0.3454886819 0.6273976711
#> [66] 0.4686531660 0.6168212816 0.2097416152 0.0703774171 0.5280987371
#> [71] 0.2788692161 0.3355113363 0.3388684399 0.2468694879 0.1210995505
#> [76] 0.4063767171 0.1075867382 0.7758433735 0.5431794343 0.9817624143
#> [81] 0.4714252711 0.3129043274 0.8519159236 0.2192278604 0.2754109078
#> [86] 0.9434416124 0.7397910126 0.2484398137 0.5336137633 0.7483879288
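
For readers who want a feel for this step without the package, the following base-R sketch mimics its shape. It assumes Poisson claim counts and occurrence times uniformly distributed within each period; both distributional choices, the object names, and the parameter values are assumptions made for illustration (the expected count of 90 per period is simply close to the 3624 / 40 = 90.6 average implied by the claims dataset shown later).

```r
set.seed(1)

n_periods <- 40        # hypothetical: number of occurrence periods
expected_claims <- 90  # hypothetical: expected number of claims per period

# Claim counts per period (assumed Poisson)
n_sketch <- stats::rpois(n_periods, lambda = expected_claims)

# Occurrence times per period, as fractions of the period (assumed uniform on [0, 1])
occurrence_sketch <- lapply(n_sketch, function(n) stats::runif(n))

head(n_sketch)
head(occurrence_sketch[[1]])
```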

2. Claim Size

Input parameters
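
The default claim size distribution described in the module table (S^0.2 ~ Normal(9.5, sd = 3), left truncated at 30, scaled by ref_claim / 200000) can be sketched in base R with simple rejection sampling. The sampler below is hypothetical and is not the package's implementation; in particular, it assumes that the truncation point 30 applies to the claim size S itself before scaling.

```r
# Hypothetical sampler for the default claim size distribution described above
sample_claim_size <- function(n, ref_claim) {
  out <- numeric(0)
  while (length(out) < n) {
    x <- stats::rnorm(n, mean = 9.5, sd = 3)  # draws of S^0.2
    s <- x^5                                  # back-transform to S
    out <- c(out, s[s > 30])                  # assumed left truncation of S at 30
  }
  out[seq_len(n)] * ref_claim / 200000        # rescale for other reference values
}

set.seed(1)
sizes <- sample_claim_size(1000, ref_claim = 200000)
summary(sizes)
```

Rejection sampling is used here only because it is easy to read; very few draws are rejected since the truncation point lies far in the left tail.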

Implementation and Output

3. Claim Notification

Input parameters

It is assumed that the notification delay of a claim follows a Weibull distribution, conditional on the size of the claim and/or period of occurrence. Required inputs are:

Implementation and Output
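
A minimal base-R sketch of sampling a Weibull delay with a given mean and coefficient of variation (CV) is shown below. The helper is hypothetical: it solves for the shape parameter numerically and is not claimed to match how the package's get_Weibull_parameters works internally. The target mean of 2 and CV of 0.70 are illustrative values only.

```r
# Hypothetical helper: Weibull shape and scale matching a target mean and CV
weibull_from_mean_cv <- function(target_mean, target_cv) {
  # CV of a Weibull depends on the shape k only
  cv_of_shape <- function(k) sqrt(gamma(1 + 2 / k) / gamma(1 + 1 / k)^2 - 1)
  shape <- stats::uniroot(function(k) cv_of_shape(k) - target_cv,
                          interval = c(0.1, 100), tol = 1e-10)$root
  scale <- target_mean / gamma(1 + 1 / shape)
  c(shape = shape, scale = scale)
}

set.seed(1)
params <- weibull_from_mean_cv(target_mean = 2, target_cv = 0.70)
delays <- stats::rweibull(5, shape = params[["shape"]], scale = params[["scale"]])
delays
```

In the simulator the target mean would depend on the claim size and/or occurrence period, so the parameters would be recomputed for each claim.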

4. Claim Closure

Input parameters

It is assumed that the settlement delay of a claim also follows a Weibull distribution, conditional on the size of the claim and/or period of occurrence. Required inputs are:

Implementation and Output

5. Claim Partial Payment - Number of Partial Payments

Input parameters

Implementation and Output
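
The two claim size benchmarks described for this module can be expressed as a simple classification in base R. The function below is a hypothetical helper that only assigns each claim to its payment-count band; it does not simulate the actual number of payments, and the treatment of claims exactly on a boundary is an assumption.

```r
# Hypothetical helper: payment-count band implied by the default benchmarks
payment_band <- function(claim_size, ref_claim) {
  cut(claim_size,
      breaks = c(0, 0.0375, 0.075, Inf) * ref_claim,
      labels = c("1 or 2", "2 or 3", "4 or more"))
}

ref_claim <- 200000
# Benchmarks are 7,500 and 15,000 at ref_claim = 200000
payment_band(c(5000, 10000, 50000), ref_claim)
```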

Interlude: Claims Dataset

Use the following code to create a claims dataset containing all individual claims features.

claim_dataset <- generate_claim_dataset(
  frequency_vector = n_vector,
  occurrence_list = occurrence_times,
  claim_size_list = claim_sizes,
  notification_list = notidel,
  settlement_list = setldel,
  no_payments_list = no_payments
)
str(claim_dataset)
#> 'data.frame':    3624 obs. of  7 variables:
#>  $ claim_no         : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ occurrence_period: num  1 1 1 1 1 1 1 1 1 1 ...
#>  $ occurrence_time  : num  0.624 0.121 0.222 0.454 0.591 ...
#>  $ claim_size       : num  785871 22562 215771 117654 31627 ...
#>  $ notidel          : num  0.0652 1.1772 2.5262 0.9262 1.6507 ...
#>  $ setldel          : num  18.23 2.33 34 11.98 11.81 ...
#>  $ no_payment       : num  6 4 11 6 4 12 1 9 2 5 ...

test_claim_dataset, included as part of the package, is an example dataset of individual claims features using the default assumptions.

str(test_claim_dataset)
#> 'data.frame':    3624 obs. of  7 variables:
#>  $ claim_no         : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ occurrence_period: num  1 1 1 1 1 1 1 1 1 1 ...
#>  $ occurrence_time  : num  0.624 0.121 0.222 0.454 0.591 ...
#>  $ claim_size       : num  785871 22562 215771 117654 31627 ...
#>  $ notidel          : num  0.0652 1.1772 2.5262 0.9262 1.6507 ...
#>  $ setldel          : num  18.23 2.33 34 11.98 11.81 ...
#>  $ no_payment       : num  6 4 11 6 4 12 1 9 2 5 ...
head(test_claim_dataset, n = 20)
#>    claim_no occurrence_period occurrence_time  claim_size   notidel    setldel
#> 1         1                 1    0.6238351404 785870.7896 0.0651635 18.2280224
#> 2         2                 1    0.1206679437  22562.2930 1.1771658  2.3316240
#> 3         3                 1    0.2220435985 215770.7413 2.5262413 34.0027919
#> 4         4                 1    0.4538308736 117653.6987 0.9261816 11.9759831
#> 5         5                 1    0.5910992266  31626.9216 1.6507312 11.8062159
#> 6         6                 1    0.9524491858 250396.9013 3.6609172 20.9091424
#> 7         7                 1    0.3660710892    190.2072 2.7766764  0.2770122
#> 8         8                 1    0.1923275446 360166.2697 1.2849926 19.5977438
#> 9         9                 1    0.5391526092  14924.8014 4.0569045  4.9303382
#> 10       10                 1    0.7398599708  26964.3529 1.4496062  6.7787507
#> 11       11                 1    0.9761979643 682118.8380 1.3594494 18.0100483
#> 12       12                 1    0.6794459166  41026.2051 5.9362177  6.9523632
#> 13       13                 1    0.6491731463   7343.8590 2.8569213  1.9486036
#> 14       14                 1    0.0145699105   5791.1945 3.9369333  0.5143549
#> 15       15                 1    0.0117662018   6512.6619 3.1055364  2.8605859
#> 16       16                 1    0.0002802343   6818.6761 6.8930171  0.6980630
#> 17       17                 1    0.1229670814  49717.9791 7.0564950  3.2989462
#> 18       18                 1    0.2181776366 986362.9680 1.6235153 20.1353658
#> 19       19                 1    0.9188914341   8333.0424 2.7932130  0.2504500
#> 20       20                 1    0.3641183279 487251.8263 2.7563826 16.9715398
#>    no_payment
#> 1           6
#> 2           4
#> 3          11
#> 4           6
#> 5           4
#> 6          12
#> 7           1
#> 8           9
#> 9           2
#> 10          5
#> 11          7
#> 12          4
#> 13          2
#> 14          2
#> 15          2
#> 16          2
#> 17          5
#> 18          9
#> 19          3
#> 20          7

6. Claim Partial Payment - Sizes of Partial Payments (without inflation)

Input parameters

simulate_amt_pmt <- function(no_pmt, claim_size) {
  # WARNING: Do not change function arguments
  if (no_pmt >= 4) {
    ## 1) Simulate the "complement" of the proportion of total claim size represented by the last two payments
    p_mean <- 1 - min(0.95, 0.75 + 0.04*log(claim_size/(0.10 * ref_claim)))
    p_CV <- 0.20
    p_parameters <- get_Beta_parameters(target_mean = p_mean, target_cv = p_CV)
    last_two_pmts_complement <- stats::rbeta(1, shape1 = p_parameters[1], shape2 = p_parameters[2])
    last_two_pmts <- 1 - last_two_pmts_complement

    ## 2) Simulate the proportion of last_two_pmts paid in the second last payment
    q_mean <- 0.9
    q_CV <- 0.03
    q_parameters <- get_Beta_parameters(target_mean = q_mean, target_cv = q_CV)
    q <- stats::rbeta(1, shape1 = q_parameters[1], shape2 = q_parameters[2])

    ## 3) Calculate the respective proportions of claim amount paid in the last 2 payments
    p_second_last <- q * last_two_pmts
    p_last <- (1-q) * last_two_pmts

    ## 4) Simulate the "unnormalised" proportions of claim amount paid in the first (m - 2) payments
    p_unnorm_mean <- last_two_pmts_complement/(no_pmt - 2)
    p_unnorm_CV <- 0.10
    p_unnorm_parameters <- get_Beta_parameters(target_mean = p_unnorm_mean, target_cv = p_unnorm_CV)
    amt <- stats::rbeta(no_pmt - 2, 
                        shape1 = p_unnorm_parameters[1], shape2 = p_unnorm_parameters[2])

    ## 5) Normalise the proportions simulated in step 4
    amt <- last_two_pmts_complement * (amt/sum(amt))

    ## 6) Attach the last 2 proportions, p_second_last and p_last
    amt <- append(amt, c(p_second_last, p_last))

    ## 7) Multiply by claim_size to obtain the actual payment amounts
    amt <- claim_size * amt

  } else if (no_pmt == 2 || no_pmt == 3) {
    p_unnorm_mean <- 1/no_pmt
    p_unnorm_CV <- 0.10
    p_unnorm_parameters <- get_Beta_parameters(target_mean = p_unnorm_mean, target_cv = p_unnorm_CV)
    amt <- stats::rbeta(no_pmt, shape1 = p_unnorm_parameters[1], shape2 = p_unnorm_parameters[2])

    ## Normalise the proportions and multiply by claim_size to obtain the actual payment amounts
    amt <- claim_size * amt/sum(amt)

  } else {
    # when there is a single payment
    amt <- claim_size

  }

  stopifnot(length(amt) == no_pmt)
  return(amt)
}
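
The function above relies on get_Beta_parameters to turn a target mean and CV into Beta shape parameters. The moment-matching idea can be sketched in base R as follows; the helper name is hypothetical and this is not claimed to be the package's own code.

```r
# Hypothetical helper: Beta shape parameters matching a target mean and CV
beta_from_mean_cv <- function(target_mean, target_cv) {
  target_var <- (target_cv * target_mean)^2
  total <- target_mean * (1 - target_mean) / target_var - 1  # shape1 + shape2
  stopifnot(total > 0)
  c(shape1 = target_mean * total, shape2 = (1 - target_mean) * total)
}

# At the benchmark claim size 0.10 * ref_claim, p_mean = 1 - 0.75 = 0.25 and p_CV = 0.20
p_parameters <- beta_from_mean_cv(target_mean = 0.25, target_cv = 0.20)
p_parameters
# shape1 = 18.5, shape2 = 55.5
```

With these parameters the Beta mean is 18.5 / 74 = 0.25 and the variance is 0.0025, i.e. a CV of 0.20, as requested.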

Implementation and Output

7. Claim Payment Time

Input parameters

simulate_d <- function(no_pmt, claim_size, setldel, occurrence_period,
                       setldel_mean_function) {
  # WARNING: Do not change function arguments
  result <- c(rep(NA, no_pmt))

  # First simulate the unnormalised values of d, sampled from a Weibull distribution
  if (no_pmt >= 4) {
    # 1) Simulate the last payment delay
    unnorm_d_mean <- (1 / 4) / time_unit
    unnorm_d_cv <- 0.20
    parameters <- get_Weibull_parameters(target_mean = unnorm_d_mean, target_cv = unnorm_d_cv)
    result[no_pmt] <- stats::rweibull(1, shape = parameters[1], scale = parameters[2])

    # 2) Simulate all the other payment delays
    for (i in 1:(no_pmt - 1)) {
      unnorm_d_mean <- setldel_mean_function(claim_size = claim_size,
                                             occurrence_period = occurrence_period)/no_pmt
      unnorm_d_cv <- 0.35
      parameters <- get_Weibull_parameters(target_mean = unnorm_d_mean, target_cv = unnorm_d_cv)
      result[i] <- stats::rweibull(1, shape = parameters[1], scale = parameters[2])
    }

  } else {
    for (i in 1:no_pmt) {
      unnorm_d_mean <- setldel_mean_function(claim_size = claim_size,
                                             occurrence_period = occurrence_period)/no_pmt
      unnorm_d_cv <- 0.35
      parameters <- get_Weibull_parameters(target_mean = unnorm_d_mean, target_cv = unnorm_d_cv)
      result[i] <- stats::rweibull(1, shape = parameters[1], scale = parameters[2])
    }
  }

  stopifnot(sum(is.na(result)) == 0)
  # Normalise d such that sum(inter-partial delays) = settlement delay
  # To make sure that the pmtdels add up exactly to setldel, we treat the last one separately
  # seq_len(no_pmt - 1) indexes the first (no_pmt - 1) delays and is empty when no_pmt == 1
  head_idx <- seq_len(no_pmt - 1)
  result[head_idx] <- (setldel/sum(result)) * result[head_idx]
  result[no_pmt] <- setldel - sum(result[head_idx])

  return(result)
}
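
The normalisation step at the end of simulate_d can be isolated in a few lines of base R. The helper below is a hypothetical stand-alone version, shown with made-up delays, to make clear that the rescaled inter-partial delays sum exactly to the settlement delay.

```r
# Hypothetical stand-alone version of the normalisation step
normalise_delays <- function(d, setldel) {
  n <- length(d)
  head_idx <- seq_len(n - 1)                       # empty when there is a single payment
  d[head_idx] <- (setldel / sum(d)) * d[head_idx]  # rescale all but the last delay
  d[n] <- setldel - sum(d[head_idx])               # last delay absorbs any rounding error
  d
}

normalise_delays(c(2, 3, 1, 4), setldel = 20)
# 4 6 2 8
normalise_delays(5, setldel = 7.5)
# 7.5
```

Note that in R the expression `1:n-1` is parsed as `(1:n) - 1`, i.e. `0:(n - 1)`; indexing with it works because index 0 is silently dropped, whereas `seq_len(n - 1)` states the intent directly.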

Implementation and Output

8. Claim Inflation

Input parameters

Implementation and Output

Interlude: Transaction Dataset

Use the following code to create a transactions dataset containing full information of all the partial payments made.

# construct a "claims" object to store all the simulated quantities
all_claims <- claims(
  frequency_vector = n_vector,
  occurrence_list = occurrence_times,
  claim_size_list = claim_sizes,
  notification_list = notidel,
  settlement_list = setldel,
  no_payments_list = no_payments,
  payment_size_list = payment_sizes,
  payment_delay_list = payment_delays,
  payment_time_list = payment_times,
  payment_inflated_list = payment_inflated
)
transaction_dataset <- generate_transaction_dataset(
  all_claims,
  adjust = FALSE # to keep the original (potentially out-of-bound) simulated payment times
)
str(transaction_dataset)
#> 'data.frame':    18983 obs. of  12 variables:
#>  $ claim_no         : int  1 1 1 1 1 1 2 2 2 2 ...
#>  $ pmt_no           : num  1 2 3 4 5 6 1 2 3 4 ...
#>  $ occurrence_period: num  1 1 1 1 1 1 1 1 1 1 ...
#>  $ occurrence_time  : num  0.624 0.624 0.624 0.624 0.624 ...
#>  $ claim_size       : num  785871 785871 785871 785871 785871 ...
#>  $ notidel          : num  0.0652 0.0652 0.0652 0.0652 0.0652 ...
#>  $ setldel          : num  18.2 18.2 18.2 18.2 18.2 ...
#>  $ payment_time     : num  4.2 7.1 11.2 14.4 18.5 ...
#>  $ payment_period   : num  5 8 12 15 19 19 3 3 4 4 ...
#>  $ payment_size     : num  25105 26177 26333 26341 592457 ...
#>  $ payment_inflated : num  25632 27113 27829 28294 649128 ...
#>  $ payment_delay    : num  3.51 2.9 4.06 3.29 4.01 ...

test_transaction_dataset, included as part of the package, is an example dataset showing full information of the claims features at a transaction/payment level, generated using the default assumptions.

str(test_transaction_dataset)
#> 'data.frame':    18983 obs. of  12 variables:
#>  $ claim_no         : int  1 1 1 1 1 1 2 2 2 2 ...
#>  $ pmt_no           : num  1 2 3 4 5 6 1 2 3 4 ...
#>  $ occurrence_period: num  1 1 1 1 1 1 1 1 1 1 ...
#>  $ occurrence_time  : num  0.624 0.624 0.624 0.624 0.624 ...
#>  $ claim_size       : num  785871 785871 785871 785871 785871 ...
#>  $ notidel          : num  0.0652 0.0652 0.0652 0.0652 0.0652 ...
#>  $ setldel          : num  18.2 18.2 18.2 18.2 18.2 ...
#>  $ payment_time     : num  4.2 7.1 11.2 14.4 18.5 ...
#>  $ payment_period   : num  5 8 12 15 19 19 3 3 4 4 ...
#>  $ payment_size     : num  25105 26177 26333 26341 592457 ...
#>  $ payment_inflated : num  25632 27113 27829 28294 649128 ...
#>  $ payment_delay    : num  3.51 2.9 4.06 3.29 4.01 ...
head(test_transaction_dataset, n = 20)
#>    claim_no pmt_no occurrence_period occurrence_time claim_size   notidel
#> 1         1      1                 1       0.6238351  785870.79 0.0651635
#> 2         1      2                 1       0.6238351  785870.79 0.0651635
#> 3         1      3                 1       0.6238351  785870.79 0.0651635
#> 4         1      4                 1       0.6238351  785870.79 0.0651635
#> 5         1      5                 1       0.6238351  785870.79 0.0651635
#> 6         1      6                 1       0.6238351  785870.79 0.0651635
#> 7         2      1                 1       0.1206679   22562.29 1.1771658
#> 8         2      2                 1       0.1206679   22562.29 1.1771658
#> 9         2      3                 1       0.1206679   22562.29 1.1771658
#> 10        2      4                 1       0.1206679   22562.29 1.1771658
#> 11        3      1                 1       0.2220436  215770.74 2.5262413
#> 12        3      2                 1       0.2220436  215770.74 2.5262413
#> 13        3      3                 1       0.2220436  215770.74 2.5262413
#> 14        3      4                 1       0.2220436  215770.74 2.5262413
#> 15        3      5                 1       0.2220436  215770.74 2.5262413
#> 16        3      6                 1       0.2220436  215770.74 2.5262413
#> 17        3      7                 1       0.2220436  215770.74 2.5262413
#> 18        3      8                 1       0.2220436  215770.74 2.5262413
#> 19        3      9                 1       0.2220436  215770.74 2.5262413
#> 20        3     10                 1       0.2220436  215770.74 2.5262413
#>      setldel payment_time payment_period payment_size payment_inflated
#> 1  18.228022     4.197594              5    25104.778        25631.935
#> 2  18.228022     7.096012              8    26176.620        27112.546
#> 3  18.228022    11.157697             12    26333.187        27828.702
#> 4  18.228022    14.445762             15    26341.097        28293.904
#> 5  18.228022    18.452453             19   592456.914       649127.995
#> 6  18.228022    18.917021             19    89458.193        98240.945
#> 7   2.331624     2.203365              3     2005.477         2305.906
#> 8   2.331624     2.695137              3     2124.634         2520.223
#> 9   2.331624     3.316770              4    15986.059        19724.225
#> 10  2.331624     3.629458              4     2446.123         3078.507
#> 11 34.002792     5.929454              6     3376.711         3477.302
#> 12 34.002792    10.427212             11     3090.246         3253.959
#> 13 34.002792    11.576364             12     3687.927         3905.459
#> 14 34.002792    17.034167             18     3859.225         4198.792
#> 15 34.002792    18.594228             19     2592.965         2842.988
#> 16 34.002792    20.559911             21     3080.938         3411.046
#> 17 34.002792    24.728598             25     3305.560         3736.050
#> 18 34.002792    28.277329             29     3859.252         4439.158
#> 19 34.002792    31.552114             32     3237.664         3785.037
#> 20 34.002792    35.126071             36   168298.113       200263.503
#>    payment_delay
#> 1      3.5085951
#> 2      2.8984182
#> 3      4.0616851
#> 4      3.2880649
#> 5      4.0066914
#> 6      0.4645676
#> 7      0.9055316
#> 8      0.4917720
#> 9      0.6216322
#> 10     0.3126881
#> 11     3.1811688
#> 12     4.4977582
#> 13     1.1491522
#> 14     5.4578035
#> 15     1.5600603
#> 16     1.9656830
#> 17     4.1686873
#> 18     3.5487313
#> 19     3.2747851
#> 20     3.5739561
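
The columns of the transaction dataset are linked in a simple way, which can be checked by hand for claim 1 using the values printed above. The relationships in this base-R sketch (payment time as occurrence time plus notification delay plus cumulative payment delays, and payment period as the ceiling of the payment time) are inferred from the displayed rows rather than quoted from the package documentation.

```r
# Values copied from the printed rows for claim 1
occurrence_time <- 0.6238351
notidel <- 0.0651635
setldel <- 18.228022
payment_delay <- c(3.5085951, 2.8984182, 4.0616851, 3.2880649, 4.0066914, 0.4645676)

# Payment times on the continuous scale and the implied calendar periods
payment_time <- occurrence_time + notidel + cumsum(payment_delay)
payment_period <- ceiling(payment_time)

round(payment_time, 6)
payment_period
# 5 8 12 15 19 19

# The inter-partial delays add up to the settlement delay (up to print rounding)
sum(payment_delay)
```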

Output

# 1. Constant dollar value INCREMENTAL triangle
output <- claim_output(n_vector, payment_times, payment_sizes,
                       incremental = TRUE)

# 2. Constant dollar value CUMULATIVE triangle
output_cum <- claim_output(n_vector, payment_times, payment_sizes,
                           incremental = FALSE)

# 3. Actual (i.e. inflated) INCREMENTAL triangle
output_actual <- claim_output(n_vector, payment_times, payment_inflated,
                              incremental = TRUE)

# 4. Actual (i.e. inflated) CUMULATIVE triangle
output_actual_cum <- claim_output(n_vector, payment_times, payment_inflated,
                                  incremental = FALSE)

# Aggregate at a yearly level
claim_output(n_vector, payment_times, payment_sizes, aggregate_level = 4)
#>          [,1]     [,2]     [,3]     [,4]     [,5]     [,6]    [,7]      [,8]
#>  [1,] 2009590  8505554  8348936  8957307 10445776 10141882 3085936 5890648.5
#>  [2,] 2602776  9137704 11030093  6796913  7405667  7262836 3284628 4063720.5
#>  [3,] 5277813 12530806 11015397  8736816  4496450  5361189 4999800 2564993.4
#>  [4,] 2658895 10419023 12082357  9213348  5632499  7918255 2853011 1673988.0
#>  [5,] 3513930 12758814 13166262 10376179  6734269  6715046 2578470 2627688.2
#>  [6,] 3979376 12784277 11220522 10155186  8813239  6309899 2990714  875445.2
#>  [7,] 3558360 10780812  9486389 11397725  6512777  6618403 2797526 3418709.6
#>  [8,] 4040515  9521512 11020902  8469382  6003697  3708348 2188494 2297513.6
#>  [9,] 2683375 11067316  9561111  9473244 10422712  4275328 2115661 3223131.2
#> [10,] 3208185 15034265 10409376  8947175  5404483  5933266 5398764 1029935.2
#>             [,9]     [,10]
#>  [1,] 2783864.18 1826458.8
#>  [2,] 3357908.39 4287166.8
#>  [3,] 1810409.85 1011759.2
#>  [4,] 1977133.89 1324449.9
#>  [5,] 1945930.19 2156368.1
#>  [6,]  512205.17  682108.3
#>  [7,]  413480.52  313633.9
#>  [8,] 1358740.70 1116257.6
#>  [9,] 2035858.72 2854641.0
#> [10,]   99013.65  629403.4
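
The relationship between the incremental and cumulative versions of a triangle can be shown on a toy matrix in base R. The numbers below are made up, and the layout (rows as occurrence periods, columns as development periods) is an assumption about how the output is arranged.

```r
# Toy incremental payments: rows = occurrence periods, columns = development periods
incremental <- matrix(c(10, 5, 2,
                        12, 6, 3,
                         8, 4, 1), nrow = 3, byrow = TRUE)

# Cumulative payments are running totals along each row
cumulative <- t(apply(incremental, 1, cumsum))
cumulative

# Differencing each row recovers the incremental payments
recovered <- cbind(cumulative[, 1], t(apply(cumulative, 1, diff)))
```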

Conversion to Time Series Objects

At any point in the analysis, the simulated output can be converted to a ts object by running:

list_as_ts <- stats::ts(list, start = , frequency = )

The conversion to ts objects is easy, but many functionalities with the ts class may not apply to the new ts objects created as they do not follow a rigid ts structure (which requires data to be sampled at equispaced points in time). The main advantage of conversion to time series is that data can now be characterised/indexed by time (see stats::window()).
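
As a small self-contained illustration of time-based indexing, the claim counts printed in the Claim Occurrence section can be turned into a quarterly series in base R. The start date of 2010 Q1 is taken from the conversion code in this section; the object names ending in `_demo` are hypothetical.

```r
# Claim counts copied from the n_vector output shown earlier (40 quarters)
n_vector_demo <- c(90, 79, 102, 78, 86, 88, 116, 84, 93, 104,
                   80, 87, 86, 104, 81, 84, 101, 96, 96, 86,
                   102, 103, 82, 83, 80, 80, 82, 87, 103, 79,
                   79, 100, 94, 99, 88, 101, 91, 95, 91, 84)

n_ts_demo <- stats::ts(n_vector_demo, start = c(2010, 1), frequency = 4)

# Claim counts for the four occurrence quarters of 2019
counts_2019 <- stats::window(n_ts_demo, start = c(2019, 1), end = c(2019, 4))
counts_2019
```

With 40 quarters starting in 2010 Q1, the series ends in 2019 Q4, so the window above returns the last four entries.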

to_convert <- c("n_vector", "occurrence_times", "claim_sizes", 
                "notidel", "setldel", "no_payments", "payment_sizes", 
                "payment_delays", "payment_times", "payment_inflated")
for (i in to_convert) {
  # equivalently, claim_sizes_ts <- ts(claim_sizes, start = c(2010, 1), frequency = 4)
  # and repeat for each of the output quantities
  list_original <- eval(as.name(i))
  list_as_ts <- stats::ts(list_original, start = c(2010, 1), frequency = 4)
  assign(paste(i, "_ts", sep=""), list_as_ts)
}

# display the simulated claim sizes by occurrence quarter for 2019
stats::window(claim_sizes_ts, start = c(2019, 1), end = c(2019, 4))
#>      Qtr1
#> 2019 15942.6061, 183383.0945, 146949.5274, 426064.7986, 119358.9725, 547119.4081, 193262.5310, 10568.5189, 3884.3312, 140390.5307, 14644.8292, 77153.6115, 38883.6457, 13104.7461, 138230.4752, 47149.1927, 27005.6597, 8160.0403, 17505.4247, 48985.9326, 148915.6589, 416267.6141, 54984.0358, 682.6724, 104734.6687, 55375.5358, 308092.3882, 98586.2031, 126039.6889, 697.2189, 22438.0680, 41160.2875, 67548.2750, 318732.6275, 52006.2335, 13543.2976, 2512.9142, 54710.8024, 28923.8367, 2268.4494, 5611.8123, 9049.0322, 329329.4035, 49815.8007, 92754.3472, 15059.5386, 22957.5863, 201160.0934, 79192.9610, 50019.1117, 21623.7874, 113059.1123, 437396.7382, 60088.9319, 302038.7745, 574588.6086, 98023.4058, 64709.0790, 428054.1348, 116068.3848, 90388.9430, 3225.9971, 1666.9717, 6075.5935, 189848.6483, 120156.9276, 556045.8974, 229756.5441, 219845.5236, 37468.5097, 222868.9305, 24684.6638, 117600.5678, 85087.7133, 72525.0807, 106559.2480, 191938.7266, 286264.4688, 16433.8384, 283730.7546, 211336.5223, 344931.2402, 7135.9388, 80927.9824, 539602.2773, 19245.8155, 321549.6572, 4543.5864, 99106.3295, 685582.5620, 36466.4661
#>      Qtr2
#> 2019 344988.6218, 334263.1567, 103857.1037, 27107.4613, 13437.1879, 229754.6470, 63035.9563, 139640.2145, 23438.6713, 47046.3980, 209269.6922, 975.1689, 207360.7619, 59967.4569, 4594.3086, 41494.7316, 113526.5467, 612679.8290, 16995.9710, 252061.2288, 23581.8598, 40826.3415, 190413.0265, 115740.1107, 14145.3147, 6464.5865, 18912.0079, 750766.2501, 13293.9404, 62634.2761, 149698.4672, 83418.7980, 555759.1085, 15467.0016, 22541.7811, 77715.0422, 11455.2837, 19698.0343, 40675.4973, 22244.6884, 160066.5794, 84282.4088, 127002.0011, 147817.3252, 504185.2397, 60297.0010, 68448.9843, 59207.5894, 76095.3773, 51665.5294, 15508.6283, 34765.9565, 104871.7543, 2414.9768, 120046.3502, 47764.7793, 1720440.3129, 472611.6892, 582358.3255, 452157.1159, 239191.5567, 17881.9283, 71882.3777, 1290.9784, 274347.3174, 97844.7941, 191418.0165, 17363.1235, 366579.9670, 84686.1270, 43598.6573, 73203.6470, 30813.5039, 202407.8407, 84307.6258, 222743.4582, 75569.4455, 47965.4069, 845479.5748, 254328.5964, 138142.2355, 849.3095, 64246.3612, 12482.2424, 43577.2832, 51855.7520, 560.2382, 12500.9384, 5082.2984, 72350.8980, 273204.2716, 390931.7499, 82798.4509, 124769.5510, 321.1051
#>      Qtr3
#> 2019 11265.6540, 156377.8939, 15032.0115, 315306.6356, 22203.7649, 107664.5029, 45833.1592, 131679.9917, 10634.2425, 17776.0189, 3139.3922, 135199.4816, 148886.4457, 49200.9337, 12995.2727, 53441.0382, 10394.9770, 40036.1131, 132178.0397, 7850.0476, 233006.6947, 4854.2938, 22765.5435, 169726.0318, 19794.5393, 325200.2918, 81010.0505, 91287.6241, 152711.8442, 83866.0192, 10503.4269, 86432.6253, 351747.0121, 235578.9959, 6702.8797, 60990.6149, 34365.5397, 2508.5331, 48872.9829, 258960.2051, 22936.5681, 518890.9825, 2831987.8000, 727774.3474, 34735.3760, 29554.2456, 224870.1406, 5910.6726, 380590.7927, 8350.4047, 26810.1167, 7867.3481, 65402.9407, 138093.7071, 413647.5578, 18684.4680, 131954.5097, 84022.7678, 164392.9074, 57527.7571, 100686.3146, 12663.4771, 121609.4028, 507142.8585, 9650.1174, 542848.7735, 331293.5935, 179806.7362, 15174.5041, 82588.0650, 3317.7434, 21997.7450, 180678.6622, 176652.8429, 79514.5208, 47949.0475, 398616.1195, 50646.0044, 4569.0209, 388.4624, 561767.6745, 84720.0868, 163232.8710, 38593.7278, 177009.0102, 11855.1228, 25744.0326, 1420.8688, 26930.8699, 18067.5170, 315699.6025
#>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Qtr4
#> 2019 2.931196e+04, 2.842948e+04, 9.921564e+05, 5.306111e+04, 1.628713e+06, 1.770657e+05, 2.096915e+05, 4.408395e+04, 5.489969e+04, 2.792973e+04, 2.012338e+04, 1.342338e+05, 2.749026e+04, 4.891341e+04, 6.286061e+04, 6.102122e+04, 1.095112e+05, 8.311796e+05, 1.007868e+05, 7.166872e+05, 6.771066e+04, 1.920495e+05, 9.006534e+03, 8.909549e+04, 1.298898e+04, 5.087712e+05, 2.545302e+04, 5.394127e+04, 6.004005e+03, 8.469990e+04, 5.696675e+04, 6.237261e+03, 1.994032e+05, 4.069632e+04, 1.210119e+04, 1.154883e+05, 4.933205e+05, 4.957509e+04, 1.143457e+03, 6.839566e+04, 2.149010e+05, 1.030940e+04, 3.387904e+05, 2.933884e+04, 2.713148e+04, 3.788734e+04, 7.249265e+05, 4.039788e+04, 5.017148e+05, 1.039237e+05, 3.467156e+04, 5.440671e+04, 2.261136e+04, 2.342762e+05, 1.065787e+05, 1.553035e+05, 3.119232e+05, 2.330761e+05, 4.545071e+04, 1.163156e+05, 4.021204e+04, 1.053078e+05, 1.584778e+05, 3.153620e+04, 2.159009e+05, 1.450122e+03, 2.242216e+05, 3.397114e+05, 4.601594e+03, 4.049227e+05, 2.665915e+05, 7.961880e+02, 1.480091e+05, 8.767667e+04, 8.324513e+03, 3.949678e+01, 1.134828e+05, 7.432735e+05, 1.380562e+05, 8.596601e+05, 1.005980e+05, 2.074477e+05, 2.417352e+04, 2.707373e+05

Plot of Cumulative Claims Payments

Note that by default, as with claim_output and claim_payment_inflation, the claims development is truncated: payments projected to fall beyond the maximum development period are forced to be paid at the exact end of the maximum development period allowed. This convention concentrates transactions at the end of development period \(I\), which shows up as a surge in claims in the \(I\)th period.

Users can set adjust = FALSE to see the “true” picture of claims development without this artificial adjustment. If the two plots look significantly different, the chosen lag parameters (notification and/or settlement delays) are not well matched to the maximum number of development periods allowed, and consideration might be given to changing one or the other.

plot(test_claims_object)

# compare with the full, unadjusted picture
plot(test_claims_object, adjust = FALSE)

# plot by occurrence and development years
plot(test_claims_object, by_year = TRUE)
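
The effect of the adjustment can be sketched on a toy example in base R. This is only an illustration of the convention described above, not the package's internal code; max_dev, dev_period and amount are hypothetical names and values.

```r
# Toy illustration only; not SynthETIC's internal code.
max_dev    <- 4                                  # assumed maximum development period
dev_period <- c(1, 2, 2, 3, 5, 7, 4, 6)          # hypothetical development period of each payment
amount     <- c(10, 20, 15, 30, 25, 5, 40, 10)   # hypothetical payment sizes

# Analogue of adjust = TRUE: payments beyond max_dev are moved to max_dev
dev_adjusted <- pmin(dev_period, max_dev)

# Total paid in each development period, with and without the adjustment
paid_adjusted <- tapply(amount, factor(dev_adjusted, levels = 1:max_dev), sum)
paid_true     <- tapply(amount, factor(dev_period, levels = 1:max(dev_period)), sum)

# Cumulative payments: same total, but the adjusted version jumps at max_dev
cumsum(paid_adjusted)
cumsum(paid_true)
```

Both cumulative series end at the same total; the adjusted one reaches it in period max_dev, which is the surge described above.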

Multiple Simulation Runs

Once all the input parameters have been set up, we can repeat the simulation process as many times as desired through a for loop. The code below saves the transaction dataset generated by each simulation run as a component of results_all.

times <- 100
results_all <- vector("list", times)  # preallocate one slot per run
for (i in seq_len(times)) {
  # Module 1: Claim occurrence
  n_vector <- claim_frequency(I, E, lambda)
  occurrence_times <- claim_occurrence(n_vector)
  # Module 2: Claim size
  # S_df is a distribution function, so sample by inversion (type = "p") over range
  claim_sizes <- claim_size(n_vector, S_df, type = "p", range = c(0, 1e24))
  # Module 3: Claim notification
  notidel <- claim_notification(n_vector, claim_sizes, notidel_mean, notidel_cv)
  # Module 4: Claim settlement
  setldel <- claim_closure(n_vector, claim_sizes, setldel_mean, setldel_cv)
  # Module 5: Claim payment count
  no_payments <- claim_payment_no(n_vector, claim_sizes, simulate_no_pmt,
                                  claim_size_benchmark_1 = benchmark_1,
                                  claim_size_benchmark_2 = benchmark_2)
  # Module 6: Claim payment size
  payment_sizes <- claim_payment_size(n_vector, claim_sizes, no_payments, simulate_amt_pmt)
  # Module 7: Claim payment time
  payment_delays <- claim_payment_delay(n_vector, claim_sizes, no_payments, setldel,
                                        setldel_mean, simulate_d)
  payment_times <- claim_payment_time(n_vector, occurrence_times, notidel, payment_delays)
  # Module 8: Claim inflation
  payment_inflated <- claim_payment_inflation(
    n_vector, payment_sizes, payment_times, occurrence_times,
    claim_sizes, base_inflation_vector, SI_occurrence, SI_payment)
  
  results_all[[i]] <- generate_transaction_dataset(
    claims(
      frequency_vector = n_vector,
      occurrence_list = occurrence_times,
      claim_size_list = claim_sizes,
      notification_list = notidel,
      settlement_list = setldel,
      no_payments_list = no_payments,
      payment_size_list = payment_sizes,
      payment_delay_list = payment_delays,
      payment_time_list = payment_times,
      payment_inflated_list = payment_inflated),
    # adjust = FALSE to retain the original simulated times
    adjust = FALSE)
}
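
One possible way of working with a list like results_all is to tag each run with its index and stack the runs into a single data frame. The sketch below uses base R only and a small stand-in list, toy_results, in place of the real output; its column names are illustrative assumptions, not necessarily those returned by generate_transaction_dataset.

```r
# Stand-in for results_all: each element mimics one run's transaction data.
# Column names and values are illustrative assumptions only.
set.seed(1)
toy_results <- lapply(1:3, function(i) {
  data.frame(payment_period = sample(1:4, 5, replace = TRUE),
             payment_size   = round(rexp(5, rate = 1 / 1000), 2))
})

# Tag each run with its index, then stack all runs into one long data frame
combined <- do.call(
  rbind,
  Map(function(df, run) cbind(run = run, df), toy_results, seq_along(toy_results))
)

# Total amount paid in each run
total_by_run <- tapply(combined$payment_size, combined$run, sum)
total_by_run
```

With the real results_all, the same pattern would apply with toy_results replaced by results_all.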

What if we are interested in seeing the average claims development over a large number of simulation runs? The plot.claims function in this package at present works only for a single claims object, so we would need some way of combining the claims objects generated by each run. A much simpler alternative is to increase the exposure rates and plot the resulting claims object. Up to a scale factor, this has the same effect as averaging over a large number of simulation runs.

This long-run average of claims development offers insights into the effects of the distributional assumptions that users have made along the way, and hence the reasonableness of such choices.
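
The reasoning behind scaling the exposure can be checked numerically for the claim counts, assuming the frequency in each period is Poisson with mean E * lambda. Under that assumption, the sum of the counts from several independent runs is itself Poisson with the exposure scaled up by the number of runs. The values of E, lambda and times below are illustrative assumptions, and the sketch uses base R only.

```r
# Illustrative values only; assumes Poisson claim counts with mean E * lambda
set.seed(42)
E      <- 12000
lambda <- 0.03
times  <- 10
n_sims <- 10000

# (a) pool the claim counts of `times` independent runs at exposure E
pooled <- replicate(n_sims, sum(rpois(times, E * lambda)))

# (b) a single run at exposure E * times
scaled <- rpois(n_sims, times * E * lambda)

# The two sets of counts should have matching means and variances
c(mean_pooled = mean(pooled), mean_scaled = mean(scaled))
c(var_pooled  = var(pooled),  var_scaled  = var(scaled))
```

This only checks the claim counts; the later modules are applied claim by claim, so they act on the pooled claims in the same way.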

The code below uses the equivalent of only 10 simulation runs, yet the trend is already visible and matches the result of our single simulation run above. Increasing times will give a smoother trend, which we refrain from producing here because simulating this amount of data takes some time (100 simulations take around 10 minutes on a quad-core machine). The main bottlenecks are the claim_payment_delay and (less severely) claim_payment_size functions.

start.time <- proc.time()
times <- 10

# increase exposure to E*times to get the same results as the aggregation of
# multiple simulation runs
n_vector <- claim_frequency(I, E = E * times, lambda)
occurrence_times <- claim_occurrence(n_vector)
claim_sizes <- claim_size(n_vector)
notidel <- claim_notification(n_vector, claim_sizes, notidel_mean, notidel_cv)
setldel <- claim_closure(n_vector, claim_sizes, setldel_mean, setldel_cv)
no_payments <- claim_payment_no(n_vector, claim_sizes, simulate_no_pmt,
                                claim_size_benchmark_1 = benchmark_1,
                                claim_size_benchmark_2 = benchmark_2)
payment_sizes <- claim_payment_size(n_vector, claim_sizes, no_payments, simulate_amt_pmt)
payment_delays <- claim_payment_delay(n_vector, claim_sizes, no_payments, setldel,
                                      setldel_mean, simulate_d)
payment_times <- claim_payment_time(n_vector, occurrence_times, notidel, payment_delays)
payment_inflated <- claim_payment_inflation(
  n_vector, payment_sizes, payment_times, occurrence_times,
  claim_sizes, base_inflation_vector, SI_occurrence, SI_payment)

all_claims <- claims(
  frequency_vector = n_vector,
  occurrence_list = occurrence_times,
  claim_size_list = claim_sizes,
  notification_list = notidel,
  settlement_list = setldel,
  no_payments_list = no_payments,
  payment_size_list = payment_sizes,
  payment_delay_list = payment_delays,
  payment_time_list = payment_times,
  payment_inflated_list = payment_inflated
)
plot(all_claims, adjust = FALSE) +
  ggplot2::labs(subtitle = paste("With", times, "simulations"))

proc.time() - start.time
#>    user  system elapsed 
#>  33.512   0.358  34.036

Users can also plot by occurrence year, or remove the inflation, by setting the arguments by_year and inflated in

plot(claims, by_year = , inflated = , adjust = )