Add new expecteds and variances to an experience study
Source: R/mutate_expecvar.R
mutate_expecvar.Rd
mutate_expecvar() uses a new expected rate for a decrement of interest and
adds a corresponding expected decrements column and a corresponding variance
of expected decrements column. If the study dataset already contains expecteds
and variances measures, either new, prefixed columns are added or the current
expecteds and variances are overwritten.
Usage
mutate_expecvar(
  .data,
  new_expected_rates,
  new_expecvar_prefix = "auto",
  measure_sets = guess_measure_sets(.data),
  amount_scalar = NULL,
  .by = NULL,
  .keep = c("all", "used", "unused", "none"),
  .before = NULL,
  .after = NULL
)
Arguments
- .data
A base::data.frame() that houses an experience study.
- new_expected_rates
A numeric vector to use as the expected probability for the study's event of interest (e.g., policy lapse or insured death). This can be a column in the dataset or a new numeric vector of length 1 or nrow(.data).
- new_expecvar_prefix
A string to distinguish the new expecteds and variances columns in the dataset. To overwrite existing expecteds and variances columns, use an argument value of NULL, character(), or ''. The default 'auto' will add a numeric prefix based on the previous names of expecteds/variances so that names remain unique.
- measure_sets
A (potentially named) list of measure sets. This only needs to be specified once when chaining multiple expstudy functions, because measure_sets is passed along as an attribute of the results.
- amount_scalar
A numeric vector to use when determining amount-weighted expecteds and variances. The function treats the new expecteds/variances as amount-weighted if the corresponding actuals in the study have values greater than 1 (actuals that are not amount-weighted, i.e., counts, should only be 0 or 1).
- .by
<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.
- .keep
Control which columns from .data are retained in the output. Grouping columns and columns created by ... are always kept.
"all" retains all columns from .data. This is the default.
"used" retains only the columns used in ... to create new columns. This is useful for checking your work, as it displays inputs and outputs side-by-side.
"unused" retains only the columns not used in ... to create new columns. This is useful if you generate new columns, but no longer need the columns used to generate them.
"none" doesn't retain any extra columns from .data. Only the grouping variables and columns created by ... are kept.
- .before, .after
<tidy-select> Optionally, control where new columns should appear (the default is to add to the right hand side). See relocate() for more details.
Value
An object of the same type as .data. The output has the following properties:
- Columns from .data will be preserved according to the .keep argument.
- Existing columns that are modified by ... will always be returned in their original location.
- New columns created through ... will be placed according to the .before and .after arguments.
- The number of rows is not affected.
- Columns given the value NULL will be removed.
- Groups will be recomputed if a grouping variable is mutated.
- Data frame attributes are preserved.
Underlying Assumptions
This function was developed according to current industry practice relating to experience study calculations. Some of the assumptions incorporated are briefly outlined below.
The experience study data is at a seriatim level where repeated observations of multiple units can exist. For example, the study data can contain experience for multiple policies over multiple calendar or policy years.
Each decrement event can be described as a Bernoulli random variable with expected rate of decrement equal to $p$. Furthermore, combining multiple observation units with equal rates of decrement $p$ can be considered a Binomial random variable with $n$ equal to the number of observation units.
Decrements are considered to be uniform between observations.
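The Bernoulli and Binomial assumption above can be written out explicitly. This is the standard textbook result rather than anything taken from the package source; the symbols $X_i$ and $D$ are introduced here for illustration only, and independence between observation units is an added assumption.

```latex
$$
X_i \sim \mathrm{Bernoulli}(p), \qquad
\mathbb{E}[X_i] = p, \qquad
\mathrm{Var}(X_i) = p\,(1 - p)
$$
$$
D = \sum_{i=1}^{n} X_i \sim \mathrm{Binomial}(n, p), \qquad
\mathbb{E}[D] = n\,p, \qquad
\mathrm{Var}(D) = n\,p\,(1 - p)
$$
```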
With these assumptions, new expecteds that are not amount-weighted are calculated as the product of exposures and the expected decrement rate. New variances are calculated as the product of the new expecteds and 1 minus the new expecteds. Amount-weighted expecteds and variances follow the same calculations and additionally multiply by the amount scalar and the amount scalar squared, respectively.
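The calculation described above can be sketched by hand in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the three records and every variable name below (exposure, expected_rate, amount, and the result vectors) are hypothetical and do not come from mortexp.

```r
# Hypothetical inputs for three seriatim records (illustrative values only).
exposure      <- c(1, 0.5, 1)          # exposure for each record
expected_rate <- c(0.01, 0.02, 0.05)   # new expected decrement rates
amount        <- c(5000, 10000, 2500)  # amount scalar, e.g. a face amount

# Count basis: expecteds are exposures times the expected rate;
# variances are expecteds times (1 - expecteds), as described in the text.
expected_cnt <- exposure * expected_rate
variance_cnt <- expected_cnt * (1 - expected_cnt)

# Amount basis: scale expecteds by the amount and variances by the amount squared.
expected_amt <- expected_cnt * amount
variance_amt <- variance_cnt * amount^2

# Collect the results side by side for inspection.
result <- data.frame(expected_cnt, variance_cnt, expected_amt, variance_amt)
print(result)
```

Comparing a hand calculation like this against the columns added by mutate_expecvar() on a few rows may be a quick way to confirm which measure set and amount scalar were picked up.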
For a more detailed explanation of the methods used, please refer to the Society of Actuaries' publication on experience study calculations.
Naming convention
expstudy uses a naming convention where some functions are prefixed by the
underlying dplyr verb. The purpose of this is to signal that the expstudy
function returns a structure very similar to what the dplyr function would
produce. Note that the intention here is not to replace all dplyr use cases
but instead to add specific functionality to streamline routine experience
study analyses.
Examples
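# Add a randomly generated rate column to serve as the new expected rate
mortexp |>
  dplyr::mutate(
    NEW_EXPECTED_MORT_RT = runif(n = nrow(mortexp))
  ) |>
  # Add 'ADJ_'-prefixed expecteds and variances, amount-weighted by face amount
  mutate_expecvar(
    new_expected_rates = NEW_EXPECTED_MORT_RT,
    new_expecvar_prefix = 'ADJ_',
    amount_scalar = FACE_AMOUNT
  )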
#> # A tibble: 176,096 × 28
#> AS_OF_DATE POLICY_HOLDER GENDER SMOKING_STATUS UNDERWRITING_CLASS FACE_AMOUNT
#> * <date> <fct> <fct> <fct> <fct> <dbl>
#> 1 1998-04-30 PH_0001 FEMALE NON-SMOKER STANDARD 5000
#> 2 1998-05-31 PH_0001 FEMALE NON-SMOKER STANDARD 5000
#> 3 1998-06-30 PH_0001 FEMALE NON-SMOKER STANDARD 5000
#> 4 1998-07-31 PH_0001 FEMALE NON-SMOKER STANDARD 5000
#> 5 1998-08-31 PH_0001 FEMALE NON-SMOKER STANDARD 5000
#> 6 1998-09-30 PH_0001 FEMALE NON-SMOKER STANDARD 5000
#> 7 1998-10-31 PH_0001 FEMALE NON-SMOKER STANDARD 5000
#> 8 1998-11-30 PH_0001 FEMALE NON-SMOKER STANDARD 5000
#> 9 1998-12-31 PH_0001 FEMALE NON-SMOKER STANDARD 5000
#> 10 1999-01-31 PH_0001 FEMALE NON-SMOKER STANDARD 5000
#> # ℹ 176,086 more rows
#> # ℹ 22 more variables: INSURED_DOB <date>, ISSUE_DATE <date>,
#> # TERMINATION_DATE <date>, ISSUE_AGE <dbl>, ATTAINED_AGE <dbl>,
#> # EXPECTED_MORTALITY_RT <dbl>, POLICY_DURATION_YR <dbl>,
#> # POLICY_DURATION_MNTH <int>, POLICY_STATUS <fct>, MORT_EXPOSURE_CNT <dbl>,
#> # MORT_EXPOSURE_AMT <dbl>, MORT_ACTUAL_CNT <dbl>, MORT_ACTUAL_AMT <dbl>,
#> # MORT_EXPECTED_CNT <dbl>, MORT_EXPECTED_AMT <dbl>, …