
Categorical UK Biobank fields are exported as integer codes, and each code has a corresponding label.

For example, field 20116 “Smoking status”:

Coding Meaning
-3 Prefer not to answer
0 Never
1 Previous
2 Current
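
To illustrate what attaching such a coding to a variable means, here is a minimal sketch that builds a labelled vector by hand with `haven::labelled()`. It does not use this package's functions and is not necessarily how they work internally; the vector `smoking_raw` and its values are made up for illustration.

```r
# Hypothetical raw export: integer codes for field 20116 "Smoking status"
smoking_raw <- c(0L, 1L, 2L, -3L, 0L)

# Attach the value labels from the coding table, plus a variable title
smoking <- haven::labelled(
  smoking_raw,
  labels = c(
    "Prefer not to answer" = -3L,
    "Never"                = 0L,
    "Previous"             = 1L,
    "Current"              = 2L
  ),
  label = "Smoking status"
)

# The underlying integers are unchanged; the labels travel as attributes
haven::print_labels(smoking)

# Convert to a factor when the labels are needed, e.g. for tabulation
smoking_factor <- haven::as_factor(smoking)
table(smoking_factor)
```

Keeping the integers and storing the labels as metadata means the raw codes stay available for filtering (for example, dropping `-3`) while the readable labels can be produced on demand.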

This package includes two functions that apply the UK Biobank encoding schema: label_ukb_field() labels a single field, and label_ukb_fields() labels every UK Biobank field in a data frame. Examples:

# update the Smoking status field
ukb <- label_ukb_field(ukb, field="p20116_i0")

table(ukb$p20116_i0)                   # tabulates the values
#>    -3      0      1      2 
#>  2057 273405 172966  52949 

table(haven::as_factor(ukb$p20116_i0)) # tabulates the labels
#> Prefer not to answer                Never             Previous              Current 
#>                 2057               273405               172966                52949

haven::print_labels(ukb$p20116_i0)     # show the value:label mapping for this variable
#> Labels:
#>  value                label
#>     -3 Prefer not to answer
#>      0                Never
#>      1             Previous
#>      2              Current

# if you have a whole data frame of exported fields, you can use the wrapper function label_ukb_fields()

# say the `ukb` data frame contains 4 variables: `eid`, `p54_i0`, `p31` and `age_at_assessment` 

# update the variables that look like UK Biobank fields with titles and, where categorical, labels
# i.e., `p54_i0` and `p31` only -- `eid` and `age_at_assessment` are ignored
ukb <- label_ukb_fields(ukb)

table(ukb$p31)                   # tabulates the values
#>      0      1 
#> 273238 229031 

table(haven::as_factor(ukb$p31)) # tabulates the labels
#> Female   Male 
#> 273238 229031