Aggregate an experience study — summarise

summarise_measures() works like dplyr::summarise() and returns a new data frame with one row per combination of grouping variables. However, this function is streamlined to return the sum of an experience study's measures rather than the result of an arbitrary summary function. These measures are identified via the measure_sets argument, which can be provided directly or guessed using regular expressions (regexes). See guess_measure_sets() for additional detail on how this guessing is implemented.
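
To make the comparison with dplyr::summarise() concrete, the following is a rough sketch of the kind of dplyr code this function streamlines. It is not the package's implementation. The toy data frame study, its values, and the measure_cols vector are made up for illustration; only the column naming style is borrowed from the example output further down.

```r
library(dplyr)

# Hypothetical toy experience study; values are invented for illustration.
study <- tibble(
  UNDERWRITING_CLASS = c("PREFERRED", "PREFERRED", "STANDARD"),
  MORT_ACTUAL_CNT    = c(1, NA, 2),
  MORT_EXPOSURE_CNT  = c(10.5, 8.0, 20.0)
)

# Columns treated as measures in this sketch (an assumption; the package
# identifies them through its measure_sets argument instead).
measure_cols <- c("MORT_ACTUAL_CNT", "MORT_EXPOSURE_CNT")

# Sum every measure within each group, dropping missing values and
# returning an ungrouped result.
totals <- study |>
  group_by(UNDERWRITING_CLASS) |>
  summarise(
    across(all_of(measure_cols), \(x) sum(x, na.rm = TRUE)),
    .groups = "drop"
  )

totals
```

Here na.rm = TRUE and .groups = "drop" mirror the defaults shown under Usage.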

Usage

summarise_measures(
  .data,
  measure_sets = guess_measure_sets(.data),
  na.rm = TRUE,
  .groups = "drop",
  .by = NULL
)

Arguments

.data

A base::data.frame() that houses an experience study.

measure_sets

A (potentially named) list of measure sets. When chaining multiple expstudy functions, this only needs to be specified once, because measure_sets is passed along as an attribute of the result.

na.rm

logical. Should missing values (including NaN) be removed?

.groups

Grouping structure of the result.

"drop_last": dropping the last level of grouping. This was the only supported option before version 1.0.0.
"drop": All levels of grouping are dropped.
"keep": Same grouping structure as .data.
"rowwise": Each row is its own group.

When .groups is not specified, it is chosen based on the number of rows of the results:

If all the results have 1 row, you get "drop_last".
If the number of rows varies, you get "keep" (note that returning a variable number of rows was deprecated in favor of reframe(), which also unconditionally drops all levels of grouping).

In addition, a message informs you of that choice, unless the result is ungrouped, the option "dplyr.summarise.inform" is set to FALSE, or when summarise() is called from a function in a package.

.by

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

Value

An object usually of the same type as .data.

The rows come from the underlying group_keys().
The columns are a combination of the grouping keys and the summary expressions that you provide.
The grouping structure is controlled by the .groups= argument; the output may be another grouped_df, a tibble, or a rowwise data frame.
Data frame attributes are not preserved, because summarise() fundamentally creates a new data frame.

Naming convention

expstudy uses a naming convention where some functions are prefixed by the underlying dplyr verb. The purpose is to signal that the expstudy function returns a structure very similar to what the dplyr function would produce. Note that the intention here is not to replace all dplyr use cases but instead to add specific functionality that streamlines routine experience study analyses.

Examples

mortexp |>
  dplyr::group_by(
    UNDERWRITING_CLASS
  ) |>
  summarise_measures()
#> # A tibble: 3 × 9
#>   UNDERWRITING_CLASS MORT_ACTUAL_CNT MORT_EXPOSURE_CNT MORT_EXPECTED_CNT
#> * <fct>                        <dbl>             <dbl>             <dbl>
#> 1 PREFERRED                       35             1428.              26.4
#> 2 SELECT                          68             3460.              61.1
#> 3 STANDARD                       212             9408.             169. 
#> # ℹ 5 more variables: MORT_VARIANCE_CNT <dbl>, MORT_ACTUAL_AMT <dbl>,
#> #   MORT_EXPOSURE_AMT <dbl>, MORT_EXPECTED_AMT <dbl>, MORT_VARIANCE_AMT <dbl>
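
The measure_sets argument can also be supplied explicitly instead of relying on its default. A minimal sketch, assuming expstudy is attached and that guess_measure_sets() can be called on the ungrouped mortexp data (as the default shown under Usage suggests); the intermediate names sets and by_class are only illustrative:

```r
library(expstudy)

# Guess the measure sets once, up front, from the column names.
sets <- guess_measure_sets(mortexp)

# Pass the guessed sets explicitly rather than relying on the default.
by_class <- mortexp |>
  dplyr::group_by(
    UNDERWRITING_CLASS
  ) |>
  summarise_measures(measure_sets = sets)

by_class
```

Because the default performs the same guess, this should return the same table as the example above. Inspecting sets before aggregating is a simple way to check which columns were picked up as measures.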