Extract a user-provided list of genetic variants from imputed BGEN files (field 22828) or WGS DRAGEN BGEN files (field 24309) and load the genotypes as a data.frame.
If `source = "dragen"`, the function assumes your project has access to the WGS BGEN files released in April 2025. If it does not, run `ukbrapR:::make_dragen_bed_from_pvcfs()` instead, which uses tabix and plink to subset the DRAGEN WGS pVCF files.
Usage
extract_variants(
in_file,
out_bed = "tmp",
source = "imputed",
overwrite = FALSE,
progress = FALSE,
verbose = FALSE,
very_verbose = FALSE
)
Arguments
- in_file
A data frame or file path. Must contain the columns rsid, chr, and pos. For imputed genotypes, pos is in build 37 (GRCh37); for DRAGEN, pos is in build 38 (GRCh38). Other columns are ignored.
- out_bed
A string. Prefix for output files (optional)
default="tmp"
- source
A string. Either "imputed" or "dragen", indicating whether the variants should come from "UKB imputation from genotype" (field 22828) or "DRAGEN population level WGS variants, BGEN format [500k release]" (field 24309)
default="imputed"
- overwrite
Logical. Overwrite existing output BED files? (If the output prefix is left as 'tmp', overwrite is set to TRUE.)
default=FALSE
- progress
Logical. Show progress through each individual file
default=FALSE
- verbose
Logical. Be verbose (show individual steps)
default=FALSE
- very_verbose
Logical. Be very verbose (show individual steps and terminal output from plink etc.)
default=FALSE
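When `in_file` is passed as a data frame rather than a file path, it can help to confirm the three required columns are present before starting an extraction. The following is a minimal sketch in base R: `check_variant_table()` is a hypothetical helper written for this illustration and is not part of ukbrapR, and the rsIDs and build 37 positions are illustrative values that should be verified against dbSNP before use.

```r
# Hypothetical helper (not part of ukbrapR): stop early if the variant table
# lacks any of the columns the documentation lists as required.
check_variant_table <- function(x) {
  required <- c("rsid", "chr", "pos")
  missing_cols <- setdiff(required, names(x))
  if (length(missing_cols) > 0) {
    stop("in_file is missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(x)
}

# Illustrative variant table for source = "imputed" (positions in build 37).
# Verify coordinates against dbSNP before relying on them.
variants_b37 <- data.frame(
  rsid = c("rs1800562", "rs429358"),
  chr  = c(6, 19),
  pos  = c(26093141, 45411941),
  gene = c("HFE", "APOE"),  # extra column; documented as ignored
  stringsAsFactors = FALSE
)

check_variant_table(variants_b37)
```

A table that passes this check could then be supplied as `in_file`. Whether `chr` should be numeric or character is not stated on this page, so treat the numeric coding above as an assumption.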
Examples
liver_variants <- extract_variants(
  in_file = system.file("files", "pgs_liver_cirrhosis.txt", package = "ukbrapR"),
  out_bed = "liver_cirrhosis.imputed.variants"
)
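The same call pattern should apply to the DRAGEN WGS data when a data frame is used as input. The sketch below uses only the arguments documented above; the variant, its build 38 position, and the output prefix are illustrative assumptions, and the call is guarded so that it is attempted only where ukbrapR is installed (in practice, within a UK Biobank RAP project that has access to the WGS BGEN files).

```r
# Illustrative single-variant table for source = "dragen" (position in build 38).
# Verify the coordinate against dbSNP before relying on it.
variants_b38 <- data.frame(
  rsid = "rs1800562",
  chr  = 6,
  pos  = 26092913,
  stringsAsFactors = FALSE
)

# Only attempt the extraction where ukbrapR is available.
if (requireNamespace("ukbrapR", quietly = TRUE)) {
  hfe_variants <- ukbrapR::extract_variants(
    in_file = variants_b38,
    out_bed = "hfe.dragen.variants",  # hypothetical output prefix
    source  = "dragen",
    verbose = TRUE
  )
}
```

Giving `out_bed` an explicit prefix rather than leaving it as 'tmp' means existing output files are not overwritten unless `overwrite = TRUE` is set.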