Estimate attributable fraction in the exposed (AFE) and population attributable fraction (PAF)
Source: R/paf.R
Computes the attributable fraction in the exposed (AFE) and the population attributable fraction (PAF) with 95% confidence intervals (95% CIs) for a binary exposure (coded 0/1) and either:
- a binary outcome (prevalence / risk) when y_t is NULL, or
- a time-to-event outcome (incidence) using a Cox proportional hazards model when y_t is provided.
Binary-outcome case: A logistic regression is fitted and AFE/PAF are estimated by standardisation (g-formula), i.e. from the mean predicted risks under the observed exposure and under counterfactual exposure assignments. If z != "" (i.e., covariates are provided), the logistic regression is adjusted for them. If n_boot > 0, non-parametric bootstrap 95% CIs are returned for AFE/PAF.
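The standardisation step can be illustrated with a minimal sketch. It assumes simulated data, a single covariate (age), and the counterfactual "everyone unexposed"; the variable names and the exact contrasts are illustrative assumptions, not the internal code of paf().

```r
# Minimal sketch of standardisation (g-formula) for AFE/PAF, binary outcome.
# Assumption: simulated data and names are illustrative only.
set.seed(1)
n   <- 2000
age <- rnorm(n, 60, 10)
x   <- rbinom(n, 1, plogis(-1 + 0.03 * (age - 60)))
y   <- rbinom(n, 1, plogis(-2 + 0.8 * x + 0.04 * (age - 60)))
d   <- data.frame(x = x, y = y, age = age)

# Adjusted logistic regression
fit <- glm(y ~ x + age, family = binomial(), data = d)

# Counterfactual copy of the data with everyone set to unexposed
d0 <- d
d0$x <- 0

# Predicted risks under observed and counterfactual exposure
risk_obs <- predict(fit, newdata = d,  type = "response")
risk_x0  <- predict(fit, newdata = d0, type = "response")

# PAF: share of expected cases in the whole sample removed by setting x = 0
paf_hat <- (mean(risk_obs) - mean(risk_x0)) / mean(risk_obs)

# AFE: the same contrast restricted to exposed individuals
exposed <- d$x == 1
afe_hat <- (mean(risk_obs[exposed]) - mean(risk_x0[exposed])) /
  mean(risk_obs[exposed])

c(AFE = afe_hat, PAF = paf_hat)
```

Because unexposed individuals contribute cases to the PAF denominator but none to the numerator, the PAF from this construction is always smaller than the AFE when the exposure raises risk.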
Time-to-event case: A Cox model is fitted and the hazard ratio (HR) for exposure is used as the effect measure. Wald 95% confidence intervals are taken from the Cox model, then transformed to AFE and PAF. Crude incidence rates (per 1,000 person-years) are also returned by exposure group (assumes time is in years).
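As a rough illustration of the transformation, the sketch below assumes the common formulas AFE = (HR - 1) / HR and PAF = p_c * AFE (Miettinen's formula), where p_c is the proportion of cases that were exposed. Whether paf() uses exactly these formulas is an assumption, and the input values are hypothetical.

```r
# Sketch: transform a hazard ratio and its Wald limits to AFE and PAF.
# Assumptions: AFE = (HR - 1) / HR and PAF = p_c * AFE; the package may
# use a different formula. All input values are hypothetical.
hr  <- c(est = 2.0, lower = 1.5, upper = 2.5)  # HR with 95% Wald limits
p_c <- 0.4                                     # proportion of cases exposed

# The transformation is monotone in HR, so the limits map across directly
# (p_c is treated as fixed here)
afe <- (hr - 1) / hr
af  <- rbind(AFE = afe, PAF = p_c * afe)
af

# Crude incidence rate per 1,000 person-years (time assumed to be in years)
crude_rate <- function(events, person_years) 1000 * events / person_years
crude_rate(events = 30, person_years = 4000)
```

With these inputs the point estimates are AFE = 0.5 and PAF = 0.2, and the crude rate is 7.5 per 1,000 person-years.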
Usage
paf(
  d,
  x,
  y,
  y_t = NULL,
  z = "",
  n_boot = 0L,
  skip_boot = TRUE,
  use_parallel = FALSE,
  n_child = floor(parallelly::availableCores()/3),
  verbose = FALSE
)
Arguments
- d
A data frame containing exposure, outcome, and (optionally) follow-up time and covariates.
- x
Character string. Column name of the exposure variable (expected 0/1).
- y
Character string. Column name of the outcome variable (0/1; event indicator for survival).
- y_t
Character string. Optional column name of follow-up time for time-to-event analyses. Ideally this is in years, as crude incidence rates (per 1,000 person-years) are also returned. If not provided, y is treated as a binary outcome and AFE/PAF are estimated from it. default=NULL (character)
- z
Character string. Covariate formula additions (e.g., "+age+sex") for variables found in d. If y_t is provided, covariates are included in the Cox model. If y_t is NULL and z != "", covariates are included in a logistic regression and AFE/PAF are estimated by standardisation (g-formula). default="" (character)
- n_boot
Integer. Number of bootstrap replicates for adjusted (logistic) binary-outcome AFE/PAF. Only used when y_t is NULL and z != "". If 0, no bootstrap CIs are computed. default=0 (integer)
- skip_boot
Logical. If the regression is not significant, skip bootstrapping for the PAF CIs. default=TRUE
- use_parallel
Logical. Use parallel processing for the bootstrap replicates? default=FALSE
- n_child
Numeric. Number of child processes to create for parallel processing. The default uses a third of the available cores to avoid crashing cloud instances due to RAM limits. default=floor(parallelly::availableCores()/3)
- verbose
Logical. Be verbose. default=FALSE
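To make the role of n_boot concrete, the following is a minimal serial sketch of a non-parametric bootstrap for a standardised PAF. It assumes percentile intervals, simulated data, and a hypothetical helper paf_gformula(); the interval type and implementation used by paf() may differ.

```r
# Sketch: non-parametric bootstrap percentile CI for a standardised PAF.
# Assumptions: simulated data, percentile intervals, and a hypothetical
# helper paf_gformula(); not the package's internal code.
set.seed(42)
n <- 1000
d <- data.frame(x = rbinom(n, 1, 0.3), age = rnorm(n, 60, 10))
d$y <- rbinom(n, 1, plogis(-2 + 0.9 * d$x + 0.03 * (d$age - 60)))

# Standardised PAF from an adjusted logistic model
paf_gformula <- function(dat) {
  fit  <- glm(y ~ x + age, family = binomial(), data = dat)
  dat0 <- dat
  dat0$x <- 0  # counterfactual: everyone unexposed
  r_obs <- mean(predict(fit, newdata = dat,  type = "response"))
  r_x0  <- mean(predict(fit, newdata = dat0, type = "response"))
  (r_obs - r_x0) / r_obs
}

# Resample rows with replacement and re-estimate the PAF each time
n_boot <- 200
boot_paf <- replicate(n_boot, {
  idx <- sample.int(nrow(d), replace = TRUE)
  paf_gformula(d[idx, , drop = FALSE])
})

paf_hat <- paf_gformula(d)
paf_ci  <- quantile(boot_paf, probs = c(0.025, 0.975))
c(PAF = paf_hat, lower = paf_ci[[1]], upper = paf_ci[[2]])
```

Each replicate refits the model, so the run time grows linearly with n_boot, which is why distributing replicates across child processes can help for large data sets.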
Value
A data frame containing AFE and PAF (as proportions and percentages) with 95% CIs and p-values, plus supporting statistics (counts, ORs, excess cases in the exposed, etc.).
Details
Coding assumptions:
- x should be coded 0/1 (unexposed/exposed).
- For the binary-outcome case, y should be 0/1.
- For the time-to-event case, y is the event indicator (0/1) and y_t is follow-up time (in years if you want rates per 1,000 person-years).
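These assumptions can be checked before calling paf(). The helper below, check_coding(), is hypothetical and not part of the package; the requirement that follow-up time be strictly positive is an added assumption of this sketch.

```r
# Sketch: validate the coding assumptions before analysis.
# check_coding() is a hypothetical helper, not part of the package.
check_coding <- function(d, x, y, y_t = NULL) {
  stopifnot(is.data.frame(d), all(c(x, y, y_t) %in% names(d)))

  # TRUE if all non-missing values are 0 or 1
  is_binary <- function(v) all(v[!is.na(v)] %in% c(0, 1))

  if (!is_binary(d[[x]])) stop("`", x, "` must be coded 0/1")
  if (!is_binary(d[[y]])) stop("`", y, "` must be coded 0/1")

  # Assumption of this sketch: follow-up time should be strictly positive
  if (!is.null(y_t) && any(d[[y_t]] <= 0, na.rm = TRUE)) {
    stop("`", y_t, "` must contain positive follow-up times")
  }
  invisible(TRUE)
}

# Toy data for illustration
toy <- data.frame(
  hypertension = c(0, 1, 1),
  event        = c(0, 1, 0),
  time         = c(1.2, 3.4, 0.5)
)
check_coding(toy, x = "hypertension", y = "event", y_t = "time")
```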
Examples
# Binary exposure
example_data$hypertension <- dplyr::if_else(example_data$sbp>=140, 1, 0)
# Binary outcome (prevalence / risk) crude
res1 <- paf(
  d = example_data,
  x = "hypertension",
  y = "event"
)
#> ℹ yodr v1.1.1
# Binary outcome adjusted
res1_adj <- paf(
  d = example_data,
  x = "hypertension",
  y = "event",
  z = "age+sex",
  n_boot = 0
)
#> ℹ yodr v1.1.1
# Binary outcome adjusted, with bootstrap 95% CIs
res1_adj_cis <- paf(
  d = example_data,
  x = "hypertension",
  y = "event",
  z = "age+sex",
  n_boot = 100
)
#> ℹ yodr v1.1.1
#> → Bootstrapping adjusted AFE/PAF CIs with 100 iterations (can take a while)
#rbind(res1, res1_adj, res1_adj_cis)
# Time-to-event outcome (incidence)
res2 <- paf(
  d = example_data,
  x = "hypertension",
  y = "event",
  y_t = "time",
  z = "age+sex"
)
#> ℹ yodr v1.1.1