Creating Exposure Intervals

The package includes a synthetic data set called “records”. To build an exposure frame, the data must have “key”, “start”, and “end” columns, and the values in the key column must be unique.

expstudies::records
key start end issue_age gender
B10251C8 2010-04-10 2019-04-04 35 M
D68554D5 2005-01-01 2019-04-04 30 F
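The input requirements above can be checked before any exposures are built. The following is a minimal base R sketch, not part of the package: `check_records()` is a hypothetical helper, and the small data frame only mirrors the key, start, and end columns of the two rows printed above.

```r
# Hypothetical stand-in for the key/start/end columns of the two rows shown above.
records <- data.frame(
  key   = c("B10251C8", "D68554D5"),
  start = as.Date(c("2010-04-10", "2005-01-01")),
  end   = as.Date(c("2019-04-04", "2019-04-04")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stop with a message if the frame lacks a required
# column or if the key column contains duplicates.
check_records <- function(df) {
  required <- c("key", "start", "end")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  if (anyDuplicated(df$key) > 0) {
    stop("values in the key column must be unique")
  }
  invisible(TRUE)
}

check_records(records)
```

Running a check like this first gives a clear error message instead of a failure partway through the exposure calculation.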

The addExposures() function creates one row for each exposure interval between the start and end dates and calculates the exposure for that interval. By default, one row is created for each policy year.

exposures <- addExposures(records)
head(exposures)
key duration start_int end_int exposure
B10251C8 1 2010-04-10 2011-04-09 0.9993
B10251C8 2 2011-04-10 2012-04-09 1.002
B10251C8 3 2012-04-10 2013-04-09 0.9993
B10251C8 4 2013-04-10 2014-04-09 0.9993
B10251C8 5 2014-04-10 2015-04-09 0.9993
B10251C8 6 2015-04-10 2016-04-09 1.002

One exposure unit is 365.25 days, so the exposure for a full policy year is slightly below 1 (365 days) or slightly above 1 (366 days). Every day carries the same weight. Weighting days differently depending on whether they fall in a leap year would yield higher mortality rates for leap years even when mortality is constant, which is not desirable.
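The exposure values printed above can be reproduced by hand. The sketch below is base R only and assumes that an interval's exposure is its inclusive day count divided by 365.25, which is consistent with the printed output but is not taken from the package source. The variable names are illustrative.

```r
# First six policy-year intervals for the record that starts on 2010-04-10.
start  <- as.Date("2010-04-10")
starts <- seq(start, by = "year", length.out = 7)

py <- data.frame(
  duration  = 1:6,
  start_int = starts[1:6],
  end_int   = starts[2:7] - 1   # day before the next policy anniversary
)

# Assumption: both endpoints are counted, and one exposure unit is 365.25 days.
py$days     <- as.numeric(py$end_int - py$start_int) + 1
py$exposure <- signif(py$days / 365.25, 4)

print(py)
```

Policy years 2 and 6 contain a February 29 and therefore have 366 days, which is why their exposure is 1.002 rather than 0.9993.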

addExposures() arguments

type

There are several options available for exposure calculations. For example, we can create exposure rows by policy month.

exposures_PM <- addExposures(records, type = "PM")
head(exposures_PM)
key duration policy_month start_int end_int exposure
B10251C8 1 1 2010-04-10 2010-05-09 0.08214
B10251C8 1 2 2010-05-10 2010-06-09 0.08487
B10251C8 1 3 2010-06-10 2010-07-09 0.08214
B10251C8 1 4 2010-07-10 2010-08-09 0.08487
B10251C8 1 5 2010-08-10 2010-09-09 0.08487
B10251C8 1 6 2010-09-10 2010-10-09 0.08214
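Policy-month intervals follow the same pattern as policy-year intervals, with monthly steps from the start date. The base R sketch below rebuilds the six rows above under the same assumption of inclusive day counts divided by 365.25. It is an illustration rather than the package's implementation, and start dates late in the month (the 29th to the 31st) would need extra care because `seq()` by month can roll over into the following month.

```r
# First six policy months for the record that starts on 2010-04-10.
start  <- as.Date("2010-04-10")
starts <- seq(start, by = "month", length.out = 7)

pm <- data.frame(
  policy_month = 1:6,
  start_int    = starts[1:6],
  end_int      = starts[2:7] - 1   # day before the next monthly anniversary
)

# Assumption: inclusive day count divided by 365.25.
pm$days     <- as.numeric(pm$end_int - pm$start_int) + 1
pm$exposure <- signif(pm$days / 365.25, 4)

print(pm)
```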

The policy year and policy month options only support policy anniversary studies, because their exposure intervals may cross calendar years. Other options create exposure rows that do not cross calendar years or calendar months, which allows for calendar year or calendar month studies.

Policy year with calendar year:

exposures_PYCY <- addExposures(records, type = "PYCY")
head(exposures_PYCY)
key duration start_int end_int exposure
B10251C8 1 2010-04-10 2010-12-31 0.7283
B10251C8 1 2011-01-01 2011-04-09 0.271
B10251C8 2 2011-04-10 2011-12-31 0.7283
B10251C8 2 2012-01-01 2012-04-09 0.2738
B10251C8 3 2012-04-10 2012-12-31 0.7283
B10251C8 3 2013-01-01 2013-04-09 0.271
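The split at the calendar year boundary can be illustrated for a single policy year. This base R sketch cuts the first policy year at December 31 and again assumes exposure is the inclusive day count divided by 365.25. It only handles an interval that actually crosses a year end, and the variable names are illustrative rather than taken from the package.

```r
# First policy year of the record that starts on 2010-04-10.
start_int <- as.Date("2010-04-10")
end_int   <- as.Date("2011-04-09")

# Last day of the calendar year in which the interval starts.
year_end <- as.Date(paste0(format(start_int, "%Y"), "-12-31"))

# This sketch assumes the interval crosses the year end.
stopifnot(year_end < end_int)

pieces <- data.frame(
  start_int = c(start_int, year_end + 1),
  end_int   = c(year_end, end_int)
)
pieces$days     <- as.numeric(pieces$end_int - pieces$start_int) + 1
pieces$exposure <- signif(pieces$days / 365.25, 4)

print(pieces)
```

The two pieces cover 266 and 99 days, and together they account for the full 365-day policy year.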

Policy year with calendar month:

exposures_PYCM <- addExposures(records, type = "PYCM")
head(exposures_PYCM, n = 15)
key duration start_int end_int exposure
B10251C8 1 2010-04-10 2010-04-30 0.05749
B10251C8 1 2010-05-01 2010-05-31 0.08487
B10251C8 1 2010-06-01 2010-06-30 0.08214
B10251C8 1 2010-07-01 2010-07-31 0.08487
B10251C8 1 2010-08-01 2010-08-31 0.08487
B10251C8 1 2010-09-01 2010-09-30 0.08214
B10251C8 1 2010-10-01 2010-10-31 0.08487
B10251C8 1 2010-11-01 2010-11-30 0.08214
B10251C8 1 2010-12-01 2010-12-31 0.08487
B10251C8 1 2011-01-01 2011-01-31 0.08487
B10251C8 1 2011-02-01 2011-02-28 0.07666
B10251C8 1 2011-03-01 2011-03-31 0.08487
B10251C8 1 2011-04-01 2011-04-09 0.02464
B10251C8 2 2011-04-10 2011-04-30 0.05749
B10251C8 2 2011-05-01 2011-05-31 0.08487

Policy month with calendar year:

exposures_PMCY <- addExposures(records, type = "PMCY")
head(exposures_PMCY, n = 11)
key duration policy_month start_int end_int exposure
B10251C8 1 1 2010-04-10 2010-05-09 0.08214
B10251C8 1 2 2010-05-10 2010-06-09 0.08487
B10251C8 1 3 2010-06-10 2010-07-09 0.08214
B10251C8 1 4 2010-07-10 2010-08-09 0.08487
B10251C8 1 5 2010-08-10 2010-09-09 0.08487
B10251C8 1 6 2010-09-10 2010-10-09 0.08214
B10251C8 1 7 2010-10-10 2010-11-09 0.08487
B10251C8 1 8 2010-11-10 2010-12-09 0.08214
B10251C8 1 9 2010-12-10 2010-12-31 0.06023
B10251C8 1 9 2011-01-01 2011-01-09 0.02464
B10251C8 1 10 2011-01-10 2011-02-09 0.08487

Policy month with calendar month:

exposures_PMCM <- addExposures(records, type = "PMCM")
head(exposures_PMCM)
key duration policy_month start_int end_int exposure
B10251C8 1 1 2010-04-10 2010-04-30 0.05749
B10251C8 1 1 2010-05-01 2010-05-09 0.02464
B10251C8 1 2 2010-05-10 2010-05-31 0.06023
B10251C8 1 2 2010-06-01 2010-06-09 0.02464
B10251C8 1 3 2010-06-10 2010-06-30 0.05749
B10251C8 1 3 2010-07-01 2010-07-09 0.02464

lower_year and upper_year

The lower_year and upper_year arguments of addExposures() truncate the output by calendar year. Exposure rows will only be created if the interval lies entirely within the specified years. This can reduce computation time and memory use.

Policy year with lower and upper truncation year:

exposures_PY_2016_to_2018 <- addExposures(records, type = "PY", lower_year = 2016, upper_year = 2018)
exposures_PY_2016_to_2018
key duration start_int end_int exposure
B10251C8 7 2016-04-10 2017-04-09 0.9993
B10251C8 8 2017-04-10 2018-04-04 0.9856
D68554D5 12 2016-01-01 2016-12-31 1.002
D68554D5 13 2017-01-01 2017-12-31 0.9993
D68554D5 14 2018-01-01 2018-04-04 0.2574

Policy year with calendar month and lower truncation year:

exposures_PYCM_2019 <- addExposures(records, type = "PYCM", lower_year = 2019)
exposures_PYCM_2019
key duration start_int end_int exposure
B10251C8 9 2019-01-01 2019-01-31 0.08487
B10251C8 9 2019-02-01 2019-02-28 0.07666
B10251C8 9 2019-03-01 2019-03-31 0.08487
B10251C8 9 2019-04-01 2019-04-04 0.01095
D68554D5 15 2019-01-01 2019-01-31 0.08487
D68554D5 15 2019-02-01 2019-02-28 0.07666
D68554D5 15 2019-03-01 2019-03-31 0.08487
D68554D5 15 2019-04-01 2019-04-04 0.01095

Determine Output Size Before Calling addExposures()

We can use expSize() to get an upper bound on the number of rows that a call to addExposures() would create. Checking the bound first helps us avoid accidentally trying to create an unmanageably large frame, such as one with 100 million rows.

expSize(records, type = "PY")
## row_bound 
##        25

expSize() takes the same arguments as addExposures().

expSize(records, type = "PMCM", lower_year = 2015, upper_year = 2017)
## row_bound 
##       192
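To illustrate what a row bound means, here is one simple way to bound the number of policy-year rows without building them. It is a base R sketch under stated assumptions and not necessarily how expSize() computes its result: `py_row_bound()` is a hypothetical helper, and the data frame only mirrors the key, start, and end columns of the two example records.

```r
# Hypothetical stand-in for the key/start/end columns of the example records.
records <- data.frame(
  key   = c("B10251C8", "D68554D5"),
  start = as.Date(c("2010-04-10", "2005-01-01")),
  end   = as.Date(c("2019-04-04", "2019-04-04")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: a record cannot have more policy years than the
# number of calendar years it touches, so summing that count over all
# records gives an upper bound on the number of policy-year rows.
py_row_bound <- function(df) {
  start_year <- as.integer(format(df$start, "%Y"))
  end_year   <- as.integer(format(df$end, "%Y"))
  sum(end_year - start_year + 1)
}

py_row_bound(records)  # 25 for these two records
```

A bound like this is cheap because it needs only one pass over the records and never allocates the exposure rows themselves.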

Adding Additional Information to the Calculated Exposures

We can add additional information by joining our original records to the created exposures by the key. Below we join the gender and issue age from our original records to the exposure frame and calculate the attained age.

# %>%, inner_join(), select(), and mutate() come from dplyr
library(dplyr)
exposures_mod <- exposures %>% inner_join(select(records, key, issue_age, gender), by = "key") %>%
  mutate(attained_age = issue_age + duration - 1)
head(exposures_mod)
key duration start_int end_int exposure issue_age gender attained_age
B10251C8 1 2010-04-10 2011-04-09 0.9993 35 M 35
B10251C8 2 2011-04-10 2012-04-09 1.002 35 M 36
B10251C8 3 2012-04-10 2013-04-09 0.9993 35 M 37
B10251C8 4 2013-04-10 2014-04-09 0.9993 35 M 38
B10251C8 5 2014-04-10 2015-04-09 0.9993 35 M 39
B10251C8 6 2015-04-10 2016-04-09 1.002 35 M 40

Making Daily Exposures

You can create a row for each policy day in an interval using the addDays() function.

head(addDays(records))
key date
B10251C8 2010-04-10
B10251C8 2010-04-11
B10251C8 2010-04-12
B10251C8 2010-04-13
B10251C8 2010-04-14
B10251C8 2010-04-15

There are options for lower and upper truncation dates:

addDays(records, min_date = as.Date("2018-10-10"), max_date = as.Date("2018-10-12"))
key date
B10251C8 2018-10-10
B10251C8 2018-10-11
B10251C8 2018-10-12
D68554D5 2018-10-10
D68554D5 2018-10-11
D68554D5 2018-10-12
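The daily expansion with truncation dates can be sketched in base R. This is an illustration of the idea and not the package's implementation: `expand_days()` is a hypothetical helper, and the data frame only mirrors the key, start, and end columns of the two example records.

```r
# Hypothetical stand-in for the key/start/end columns of the example records.
records <- data.frame(
  key   = c("B10251C8", "D68554D5"),
  start = as.Date(c("2010-04-10", "2005-01-01")),
  end   = as.Date(c("2019-04-04", "2019-04-04")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per day that lies both inside the record's
# [start, end] interval and inside [min_date, max_date].
expand_days <- function(df, min_date, max_date) {
  pieces <- lapply(seq_len(nrow(df)), function(i) {
    from <- max(df$start[i], min_date)
    to   <- min(df$end[i], max_date)
    if (from > to) return(NULL)   # record has no days in the window
    data.frame(key = df$key[i], date = seq(from, to, by = "day"),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, Filter(Negate(is.null), pieces))
}

daily <- expand_days(records, as.Date("2018-10-10"), as.Date("2018-10-12"))
print(daily)
```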

You can determine the size of the output without creating it by using the daySize() function.

daySize(records, min_date = as.Date("2018-10-10"), max_date = as.Date("2018-10-12"))
## num_rows 
##        6
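The number of daily rows can be counted from the dates alone, without generating any rows. The base R sketch below shows the arithmetic. It is not necessarily how daySize() is implemented: `count_days()` is a hypothetical helper, and the data frame only mirrors the key, start, and end columns of the two example records.

```r
# Hypothetical stand-in for the key/start/end columns of the example records.
records <- data.frame(
  key   = c("B10251C8", "D68554D5"),
  start = as.Date(c("2010-04-10", "2005-01-01")),
  end   = as.Date(c("2019-04-04", "2019-04-04")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: clip each record to [min_date, max_date] and sum
# the inclusive day counts. Records outside the window contribute 0.
count_days <- function(df, min_date, max_date) {
  from <- pmax(as.numeric(df$start), as.numeric(min_date))
  to   <- pmin(as.numeric(df$end), as.numeric(max_date))
  sum(pmax(to - from + 1, 0))
}

count_days(records, as.Date("2018-10-10"), as.Date("2018-10-12"))
```

Because only two subtractions per record are needed, a count like this stays fast even when the daily frame itself would be too large to hold in memory.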