library(tidyverse)
theme_set(theme_bw())
library(yspec)
library(mrgsolve)
library(mrggsave)
library(vpc)
library(glue)
library(bbr)
library(here)
1 Introduction
This document illustrates the mechanics of using the vpc
package along with mrgsolve
to create Visual Predictive Check plots.
2 Tools used
2.1 MetrumRG packages
mrgsolve Simulate from ODE-based population PK/PD and QSP models in R.
yspec Data specification, wrangling, and documentation for pharmacometrics.
bbr Manage, track, and report modeling activities, through simple R objects, with a user-friendly interface between R and NONMEM®.
mrggsave Label, arrange, and save annotated plots and figures.
2.2 CRAN packages
vpc Create visual predictive checks.
3 Outline
- First the we load the analysis data set that was used in the NONMEM® estimation
- Next, we’ll load an mrgsolve translation of the NONMEM® model and validate the coding against outputs from the NONMEM® run
- After simulating data replicates from the mrgsolve model, we’ll construct four visual predictive checks
- Basic VPC on dose-normalized concentrations
- VPC for a single dose, stratified by a covariate
- VPC on fraction below the quantitative limit versus
- Prediction-corrected VPC
4 Required packages
All packages were installed from mpn via pkgr.
These are some global options which reflect the directory structure associated with this project
options(mrggsave.dir = tempdir(), mrg.script = "pk-vpc-final.Rmd")
options(mrgsolve.project = here("script/model"))
This is a theme that we’ll use to style the plots coming out of vpc
.
= new_vpc_theme(list(
mrg_vpc_theme sim_pi_fill = "steelblue3", sim_pi_alpha = 0.5,
sim_median_fill = "grey60", sim_median_alpha = 0.5
))
5 Setup
5.1 Data
First, we have some observations that are below the limit of quantification that need to be dealt with.
The final model estimation run was 106, so we read back those results using bbr::nm_join()
. You can read more about using bbr
for model management here Model Management.
<- 106
runno <- nm_join(here("model/pk/106"), .superset = TRUE) data
We want to load in the complete data, including observations that were ignored in the estimation run because they were below the quantitation limit (BQL). To do this, we pass .superset = TRUE
to nm_join()
. The result is the full model estimation data set along with all items that were tabled from that run. Be sure to also drop any rows that were excluded from the model estimation run for reasons other than being below the quantitation limit.
<- filter(data, is.na(C))
data
head(data)
# A tibble: 6 × 49
C NUM ID TIME SEQ CMT EVID AMT DV.DATA AGE WT HT
<lgl> <dbl> <int> <dbl> <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl>
1 NA 1 1 0 0 1 1 5 0 28.0 55.2 160.
2 NA 2 1 0.61 1 2 0 NA 61.0 28.0 55.2 160.
3 NA 3 1 1.15 1 2 0 NA 91.0 28.0 55.2 160.
4 NA 4 1 1.73 1 2 0 NA 122. 28.0 55.2 160.
5 NA 5 1 2.15 1 2 0 NA 126. 28.0 55.2 160.
6 NA 6 1 3.19 1 2 0 NA 84.7 28.0 55.2 160.
# ℹ 37 more variables: EGFR <dbl>, ALB <dbl>, BMI <dbl>, SEX <int>, AAG <dbl>,
# SCR <dbl>, AST <dbl>, ALT <dbl>, CP <int>, TAFD <dbl>, TAD <dbl>,
# LDOS <int>, MDV <int>, BLQ <int>, PHASE <int>, STUDYN <int>, DOSE <int>,
# SUBJ <int>, USUBJID <chr>, STUDY <chr>, ACTARM <chr>, RF <chr>,
# IPRED <dbl>, NPDE <dbl>, CWRES <dbl>, CL <dbl>, V2 <dbl>, Q <dbl>,
# V3 <dbl>, KA <dbl>, ETA1 <dbl>, ETA2 <dbl>, ETA3 <dbl>, DV <dbl>,
# PRED <dbl>, RES <dbl>, WRES <dbl>
We will use the data specification object that corresponds with the model estimation data (in this case pk
)
<- ys_load(here("data/derived/pk.yml"))
spec <- ys_namespace(spec, "long") spec
The function ys_get_short_unit()
will pull the “short” name from each column name and append the unit if available
<- ys_get_short_unit(spec, parens = TRUE, title_case = TRUE) lab
The function ys_add_factors()
will look at spec
for discrete data items and append them to the data set as factors
<- ys_add_factors(data, spec) data
For example
count(data, CP, CP_f)
# A tibble: 4 × 3
CP CP_f n
<int> <fct> <int>
1 0 CP score: 0 3880
2 1 CP score: 1 160
3 2 CP score: 2 160
4 3 CP score: 3 160
For convenience we will set aliases for a couple of these columns
<- mutate(data, Hepatic = CP_f, Renal = RF_f) data
5.2 The simulation model
This should reflect the nonmem model model/pk/106.ctl
<- mread(glue("{runno}.mod")) mod
5.2.1 Validate simulation model
We can validate our coding of the model, looking at PRED
returned from NONMEM® as compared to PRED
returned from simulation from the model.
We already have PRED
in the data
select(data, ID, TIME, DV, IPRED, PRED)
# A tibble: 4,360 × 5
ID TIME DV IPRED PRED
<int> <dbl> <dbl> <dbl> <dbl>
1 1 0 0 0 0
2 1 0.61 61.0 68.5 60.6
3 1 1.15 91.0 90.8 78.5
4 1 1.73 122. 97.3 83.0
5 1 2.15 126. 96.7 82.2
6 1 3.19 84.7 88.7 75.5
7 1 4.21 62.1 78.9 67.8
8 1 5.09 49.1 71.1 61.7
9 1 6.22 64.2 62.4 54.8
10 1 8.09 59.6 51.0 45.7
# ℹ 4,350 more rows
We’ll modify the mrgsolve model object to return PRED
when we simulate
<- zero_re(mod)
pmod
<- mrgsim(pmod, data = data, recover = "PRED,EVID", digits = 5)
check
head(check) %>% select(ID, TIME, Y, PRED)
ID TIME Y PRED
1 1 0.00 0.000 0.000
2 1 0.61 60.583 60.583
3 1 1.15 78.548 78.548
4 1 1.73 83.040 83.040
5 1 2.15 82.222 82.222
6 1 3.19 75.487 75.487
Keeping only observations
<- filter(check, EVID==0) check
we do a quick summary and show that all 3210 PRED
values are matching between the mrgsolve model and NONMEM® output
nrow(check)
[1] 3210
summary(check$Y - check$PRED)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
0 0 0 0 0 0 68
This demonstrates that the simulation model can reproduce the population predictions from the estimation model and is reliable for use to simulate the predictive check.
6 VPC on dose-normalized concentrations
In this example, we will normalize the observed and simulated data from SAD and MAD studies (Studies 1 and 2) and present all doses in the same plot.
Studies 1 and 2 were single- and multiple-ascending dose studies with healthy volunteers
count(data, STUDY, STUDYN, CP_f, RF_f)
# A tibble: 10 × 5
STUDY STUDYN CP_f RF_f n
<chr> <int> <fct> <fct> <int>
1 101-DEMO-001 1 CP score: 0 No impairment 480
2 101-DEMO-002 2 CP score: 0 No impairment 1800
3 201-DEMO-003 3 CP score: 0 No impairment 360
4 201-DEMO-003 3 CP score: 0 Mild impairment 360
5 201-DEMO-003 3 CP score: 0 Moderate impairment 360
6 201-DEMO-003 3 CP score: 0 Severe impairment 360
7 201-DEMO-004 4 CP score: 0 No impairment 160
8 201-DEMO-004 4 CP score: 1 No impairment 160
9 201-DEMO-004 4 CP score: 2 No impairment 160
10 201-DEMO-004 4 CP score: 3 No impairment 160
<- filter(data, STUDYN <= 2) sad_mad
6.1 The simulation function
Create a function to simulate out one replicate and re-use it for all the different simulations we will use for the predictive checks.
<- function(rep, data, model,
sim recover = "EVID,DOSE,STUDY,Renal,Hepatic") {
mrgsim(
model, data = data,
carry_out = "NUM",
recover = recover,
Req = "Y",
output = "df"
%>% mutate(irep = rep)
) }
Arguments
rep
the current replicate numberdata
the data template for simulationmodel
the mrgsolve model objectrecover
columns to carry from the data into the output
Return
A data frame with the replicate number appended as irep
.
6.2 Simulate for VPC
We’ll simulate only 100 replicates here to save some computation time; we would typically simulate more in a production context.
<- seq(100)
isim
set.seed(86486)
<- lapply(
sims
isim, sim, data = sad_mad,
mod = mod
%>% bind_rows() )
Filter both the observed and simulated data on
- actual observations that are not BQL
- simulated observations that are above the limit of quantitation (here 10 ng/mL)
<- filter(sad_mad, EVID==0, BLQ ==0)
fsad_mad <- filter(sims, EVID==0, Y >= 10) fsims
Also we are going to dose-normalize observed and simulated data so we can put all the doses on the same plot
<- mutate(fsad_mad, DVN = DV/DOSE)
fsad_mad <- mutate(fsims, YN = Y/DOSE) fsims
6.3 Calculate VPC and plot
Pass observed and simulated data into vpc()
function, noting that we want this stratified by STUDY
so that SAD and MAD regimens are plotted separately
<- vpc(
p1 obs = fsad_mad,
sim = fsims,
stratify = "STUDY",
obs_cols = list(dv = "DVN"),
sim_cols=list(dv = "YN", sim = "irep"),
log_y = TRUE,
pi = c(0.05, 0.95),
ci = c(0.025, 0.975),
facet = "columns",
show = list(obs_dv = TRUE),
vpc_theme = mrg_vpc_theme
)
Notes
- We want this vpc to show dose-normalized concentrations, so we change the name of
dv
underobs_cols
andsim_cols
- Setting
pi
toc(0.05, 0.95)
indicates that the vpc will show the 5th and 95th percentiles of observed and simulated data in addition to the median - Setting
ci
toc(0.025, 0.975)
indicates that we want to display 95% prediction intervals around the simulated statistics (median, 5th and 95th percentiles - The
show
argument indicates that we want observed data points on the plot
<-
p1 +
p1 xlab(lab$TIME) +
ylab("Dose-normalized concentration (ng/mL)")
p1
mrggsave_last(stem = "pk-vpc-{runno}-dose-norm", height = 7.5)
7 VPC stratified on a covariate
In this example, we’ll take a subset of the data from the renal study (Study 3) and stratify the VPC on the Renal Function covariate
<- filter(data, STUDYN==3) rf_data
set.seed(54321)
<- lapply(
rf_sims
isim, sim, data = rf_data,
mod = mod
%>% bind_rows() )
Filtering both the observed and simulated data; there was only a single dose administered in this study
<- filter(rf_data, EVID==0, BLQ ==0)
f_rf_data <- filter(rf_sims, EVID==0, Y >= 10) f_rf_sims
Calculate the vpc and plot
<- vpc(
p2 obs = f_rf_data,
sim = f_rf_sims,
stratify = "Renal",
obs_cols = list(dv = "DV"),
sim_cols=list(dv = "Y", sim = "irep"),
log_y = TRUE,
pi = c(0.05, 0.95),
ci = c(0.025, 0.975),
show = list(obs_dv = TRUE),
vpc_theme = mrg_vpc_theme
)
<- p2 + xlab(lab$TIME) + ylab("Concentration (ng/mL)")
p2
p2
mrggsave_last(stem = "pk-vpc-{runno}-rf", height = 6.5)
We can also run vpc stratified on hepatic function
Show the code
<- filter(data, STUDYN==4)
cp_data
set.seed(7456874)
<- lapply(
cp_sims
isim,
sim, data = cp_data,
mod = mod
%>% bind_rows()
)
<- filter(cp_data, EVID==0, BLQ ==0)
f_cp_data <- filter(cp_sims, EVID==0, Y >= 10)
f_cp_sims
<- vpc(
p3 obs = f_cp_data,
sim = f_cp_sims,
stratify = "Hepatic",
obs_cols = list(dv = "DV"),
sim_cols=list(dv="Y", sim="irep"),
log_y = TRUE,
pi = c(0.05, 0.95),
ci = c(0.025, 0.975),
labeller = label_value,
show = list(obs_dv = TRUE),
vpc_theme = mrg_vpc_theme
)
<-
p3 +
p3 xlab(lab$TIME) +
ylab("Concentration (ng/mL)")
p3
mrggsave(p3, stem = "pk-vpc-{runno}-cp", height = 6.5)
8 VPC on fraction BLQ
This is a predictive check on the fraction of observations that are below the limit of quantification versus time. We’ll take just the single dose studies and limit the analysis out to 100 hours after the dose
<- nm_join(here("model/pk/106"), .superset = TRUE)
data <- arrange(data, NUM)
data <- filter(data, STUDYN %in% c(1,3), TIME <= 100) data
Simulate using the sim()
function as we have done before
set.seed(7456874)
<- lapply(
b_sims
isim,
sim, data = data,
mod = mod
%>% bind_rows() )
Now, we use the vpc_cens()
function from the vpc
package and pass in the lower limit of quantification (lloq
)
<- vpc_cens(
p4 sim = b_sims,
obs = data,
lloq = 10,
obs_cols = list(dv = "DV"),
sim_cols = list(dv = "Y", sim = "irep")
)
<- p4 + xlab(lab$TIME)
p4
p4
mrggsave_last(stem = "pk-vpc-{runno}-BLQ", height = 6.5)
9 Prediction-corrected VPC
This VPC method removes variability in a time bin by “correcting” (or normalizing) observed and simulated data with the population predicted value (PRED
). Recall that we used bbr::nm_join()
to read in the analysis data set along with all the table output joined on. We’ll filter out the rows that were commented going into the analysis, but retain the rows that were ignored because they were BQL.
<- nm_join(here("model/pk/106"), .superset = TRUE)
data <- filter(data, is.na(C)) data
This data set includes PRED
as generated by NONMEM, but we don’t have those values for rows where the observations were BQL. We’ll use mrgsolve to simulate PRED
for every row in the complete data set.
<- mrgsim(zero_re(mod), data) out
Then, we’ll copy that value into the simulation data set for use in the prediction-corrected VPC. We’ll save the NONMEM version just as a check that the simulation went as planned.
<- mutate(data, NMPRED = PRED, PRED = out$Y)
data
select(data, ID, TIME, DV, PRED, NMPRED, IPRED)
# A tibble: 4,360 × 6
ID TIME DV PRED NMPRED IPRED
<int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 0 0 0 0 0
2 1 0.61 61.0 60.6 60.6 68.5
3 1 1.15 91.0 78.5 78.5 90.8
4 1 1.73 122. 83.0 83.0 97.3
5 1 2.15 126. 82.2 82.2 96.7
6 1 3.19 84.7 75.5 75.5 88.7
7 1 4.21 62.1 67.8 67.8 78.9
8 1 5.09 49.1 61.7 61.7 71.1
9 1 6.22 64.2 54.9 54.8 62.4
10 1 8.09 59.6 45.7 45.7 51.0
# ℹ 4,350 more rows
Taking just single dose studies (1 and 4)
<- filter(data, STUDYN %in% c(1,4)) single
<- lapply(
out seq(200),
FUN = sim,
data = single,
model = mod,
recover = "PRED,DV,EVID,DOSE"
%>% bind_rows() )
Get simulated and observed data ready so that PRED
is in both the simulated and observed data
<- filter(single, EVID==0, BLQ==0)
observed <- filter(out, EVID==0, Y >= 10)
simulated
head(simulated)
ID TIME NUM Y PRED DV EVID DOSE irep
1 1 0.61 2 57.90601 60.58323 61.005 0 5 1
2 1 1.15 3 111.78109 78.54761 90.976 0 5 1
3 1 1.73 4 128.19958 83.03987 122.210 0 5 1
4 1 2.15 5 99.45877 82.22198 126.090 0 5 1
5 1 3.19 6 64.75010 75.48693 84.682 0 5 1
6 1 4.21 7 104.03219 67.79681 62.131 0 5 1
head(observed)
# A tibble: 6 × 50
C NUM ID TIME SEQ CMT EVID AMT DV.DATA AGE WT HT
<lgl> <dbl> <int> <dbl> <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl>
1 NA 2 1 0.61 1 2 0 NA 61.0 28.0 55.2 160.
2 NA 3 1 1.15 1 2 0 NA 91.0 28.0 55.2 160.
3 NA 4 1 1.73 1 2 0 NA 122. 28.0 55.2 160.
4 NA 5 1 2.15 1 2 0 NA 126. 28.0 55.2 160.
5 NA 6 1 3.19 1 2 0 NA 84.7 28.0 55.2 160.
6 NA 7 1 4.21 1 2 0 NA 62.1 28.0 55.2 160.
# ℹ 38 more variables: EGFR <dbl>, ALB <dbl>, BMI <dbl>, SEX <int>, AAG <dbl>,
# SCR <dbl>, AST <dbl>, ALT <dbl>, CP <int>, TAFD <dbl>, TAD <dbl>,
# LDOS <int>, MDV <int>, BLQ <int>, PHASE <int>, STUDYN <int>, DOSE <int>,
# SUBJ <int>, USUBJID <chr>, STUDY <chr>, ACTARM <chr>, RF <chr>,
# IPRED <dbl>, NPDE <dbl>, CWRES <dbl>, CL <dbl>, V2 <dbl>, Q <dbl>,
# V3 <dbl>, KA <dbl>, ETA1 <dbl>, ETA2 <dbl>, ETA3 <dbl>, DV <dbl>,
# PRED <dbl>, RES <dbl>, WRES <dbl>, NMPRED <dbl>
Now in the vpc()
call, we can just set pred_corr = TRUE
<- vpc(
p5 obs = observed,
sim = simulated,
obs_cols = list(dv = "DV"),
sim_cols = list(dv = "Y", sim = "irep"),
log_y = TRUE,
pred_corr = TRUE,
pi = c(0.05, 0.95),
ci = c(0.025, 0.975),
labeller = label_value,
show = list(obs_dv = TRUE),
vpc_theme = mrg_vpc_theme
)
<-
p5 +
p5 xlab(lab$TIME) +
ylab("Pred-corrected concentration (ng/mL)")
p5
mrggsave_last(stem = "pk-vpc-{runno}-pc", height = 6.5)
10 Other resources
The following scripts from the Github repository are discussed on this page. If you’re interested running this code, visit the About the Github Repo page first.
PK VPC script: pk-vpc-final.R
PK PCVPC script: pk-pcvpc-final.R