
Computes the attributable fraction in the exposed (AFE) and the population attributable fraction (PAF) with 95% Confidence Intervals (95% CIs) for a binary exposure (coded 0/1) and either:

  • a binary outcome (prevalence / risk) when y_t is NULL, or

  • a time-to-event outcome (incidence) using a Cox proportional hazards model when y_t is provided.

Binary-outcome case:

  • A logistic regression is fitted and AFE/PAF are estimated by standardisation (g-formula): mean predicted risks are compared under the observed and counterfactual exposure assignments.

  • If z != "" (i.e., covariates provided) then the logistic regression is adjusted. If n_boot > 0, non-parametric bootstrap 95% CIs are returned for AFE/PAF.
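The standardisation and bootstrap steps described above can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's internal code: the simulated data frame `sim`, the helper `std_af()`, and the use of percentile bootstrap limits are all hypothetical choices made for this example.

```r
# Sketch only: `sim` and `std_af()` are hypothetical names for illustration,
# and the data are simulated; this is not the package's implementation.
set.seed(1)
n <- 2000
sim <- data.frame(age = rnorm(n, 60, 8), sex = rbinom(n, 1, 0.5))
sim$x <- rbinom(n, 1, plogis(-1 + 0.03 * (sim$age - 60)))
sim$y <- rbinom(n, 1, plogis(-2 + 0.8 * sim$x + 0.04 * (sim$age - 60) + 0.2 * sim$sex))

std_af <- function(d) {
  # Adjusted logistic regression of outcome on exposure and covariates
  fit <- glm(y ~ x + age + sex, data = d, family = binomial())
  # Counterfactual data set in which nobody is exposed
  d0 <- transform(d, x = 0)
  p_obs <- predict(fit, newdata = d, type = "response")   # risk under observed exposure
  p_0   <- predict(fit, newdata = d0, type = "response")  # risk if unexposed
  exposed <- d$x == 1
  c(
    # AFE: fraction of risk among the exposed attributable to exposure
    afe = (mean(p_obs[exposed]) - mean(p_0[exposed])) / mean(p_obs[exposed]),
    # PAF: fraction of risk in the whole sample attributable to exposure
    paf = (mean(p_obs) - mean(p_0)) / mean(p_obs)
  )
}

est <- std_af(sim)

# Non-parametric bootstrap: resample rows with replacement, re-estimate,
# then take percentile limits (an assumed CI method for this sketch)
n_boot <- 200
boot <- replicate(n_boot, std_af(sim[sample(nrow(sim), replace = TRUE), ]))
ci <- apply(boot, 1, quantile, probs = c(0.025, 0.975))

print(rbind(estimate = est, ci))
```

Because the model is refitted in every replicate, run time grows linearly with the number of replicates.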

Time-to-event case: A Cox model is fitted and the hazard ratio (HR) for exposure is used as the effect measure. Wald 95% confidence intervals are taken from the Cox model, then transformed to AFE and PAF. Crude incidence rates (per 1,000 person-years) are also returned by exposure group (assumes time is in years).
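As a rough illustration of the transformation step, the sketch below converts a hazard ratio and its Wald limits to AFE and PAF using two standard formulas: AFE = (HR - 1) / HR, and PAF = p_c × AFE (Miettinen's formula), where p_c is the proportion of cases who are exposed. Whether the package uses exactly these formulas is an assumption; the helper `hr_to_af()` and the input numbers are hypothetical.

```r
# Sketch only: `hr_to_af()` is a hypothetical helper and the inputs are
# made-up numbers, not output from the package.
hr_to_af <- function(hr, hr_lci, hr_uci, p_case_exposed) {
  afe <- function(h) (h - 1) / h          # attributable fraction in the exposed
  hrs <- c(est = hr, lci = hr_lci, uci = hr_uci)
  rbind(
    afe = afe(hrs),
    paf = p_case_exposed * afe(hrs)       # Miettinen: p_c * (HR - 1) / HR
  )
}

# HR of 2.0 (95% CI 1.5 to 2.5), with 40% of cases exposed
res <- hr_to_af(hr = 2, hr_lci = 1.5, hr_uci = 2.5, p_case_exposed = 0.4)
print(round(res, 3))
```

Because (HR - 1) / HR increases monotonically with HR, the lower and upper HR limits map directly to the lower and upper AFE limits.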

Usage

paf(
  d,
  x,
  y,
  y_t = NULL,
  z = "",
  n_boot = 0L,
  skip_boot = TRUE,
  use_parallel = FALSE,
  n_child = floor(parallelly::availableCores()/3),
  verbose = FALSE
)

Arguments

d

A data frame containing exposure, outcome, and (optionally) follow-up time and covariates.

x

Character string. Column name of the exposure variable (expected 0/1).

y

Character string. Column name of the outcome variable (0/1; event indicator for survival).

y_t

Character string. Optional column name of follow-up time for time-to-event analyses. Ideally this is in years, because crude incidence rates (per 1,000 person-years) are also returned. If NULL, y is treated as a binary outcome and AFE/PAF are estimated from a logistic regression. default=NULL (character)

z

A string. Covariate formula additions (e.g., "+age+sex") for variables found in d. If y_t is provided, covariates are included in the Cox model. If y_t is NULL and z != "", covariates are included in a logistic regression and AFE/PAF are estimated by standardisation (g-formula). default="" (character)

n_boot

Integer. Number of bootstrap replicates for adjusted (logistic) binary-outcome AFE/PAF. Only used when y_t is NULL and z != "". If 0 then no bootstrap CIs are computed. default=0 (integer)

skip_boot

Logical. If the regression is not significant, skip bootstrapping for PAF CIs. default=TRUE

use_parallel

Logical. Use parallel processing for bootstraps? default=FALSE

n_child

Numeric. Number of child processes to create for parallel processing. The default is a third of the available cores (rounded down) to avoid crashing cloud instances due to RAM limits. default=floor(parallelly::availableCores()/3)

verbose

Logical. Be verbose, default=FALSE.

Value

A data frame containing AFE and PAF (as proportions and percentages) with 95% CIs and p-values, plus supporting statistics (counts, ORs, excess cases in the exposed, etc.).

Details

Coding assumptions:

  1. x should be coded 0/1 (unexposed/exposed).

  2. For the binary-outcome case, y should be 0/1.

  3. For the time-to-event case, y is the event indicator (0/1) and y_t is follow-up time (in years if you want rates per 1,000 person-years).
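To show why the time unit matters, here is a minimal sketch of a crude incidence rate per 1,000 person-years by exposure group. The toy data frame and its column names are hypothetical, and the calculation is the textbook definition (events divided by total follow-up time), not necessarily the package's exact code.

```r
# Sketch only: `toy` is a made-up data set; follow-up time is in years.
toy <- data.frame(
  x    = c(0, 0, 0, 1, 1, 1),   # exposure (0 = unexposed, 1 = exposed)
  y    = c(0, 1, 0, 1, 1, 0),   # event indicator
  time = c(5, 2, 8, 1, 3, 6)    # follow-up time in years
)

events <- tapply(toy$y, toy$x, sum)      # events per exposure group
pyears <- tapply(toy$time, toy$x, sum)   # person-years per exposure group
rate_per_1000 <- 1000 * events / pyears  # crude rate per 1,000 person-years

print(rate_per_1000)
```

If follow-up were recorded in days, the same calculation would give rates per 1,000 person-days, so the time column would first need dividing by 365.25.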

Author

Luke Pilling

Examples


# Binary exposure
example_data$hypertension <- dplyr::if_else(example_data$sbp>=140, 1, 0)

 
# Binary outcome (prevalence / risk) crude
res1 <- paf(
  d = example_data,
  x = "hypertension",
  y = "event"
)
#>  yodr v1.1.1

# Binary outcome adjusted (no bootstrap CIs)
res1_adj <- paf(
  d = example_data,
  x = "hypertension",
  y = "event",
  z = "age+sex",
  n_boot = 0
)
#>  yodr v1.1.1

# Binary outcome adjusted, with bootstrap 95% CIs
res1_adj_cis <- paf(
  d = example_data,
  x = "hypertension",
  y = "event",
  z = "age+sex",
  n_boot = 100
)
#>  yodr v1.1.1
#> → Bootstrapping adjusted AFE/PAF CIs with 100 iterations (can take a while)

#rbind(res1, res1_adj, res1_adj_cis)

# Time-to-event outcome (incidence)
res2 <- paf(
  d = example_data,
  x = "hypertension",
  y = "event",
  y_t = "time",
  z = "age+sex"
)
#>  yodr v1.1.1