## Read in data
<-
my_data read_csv("../../data/401_case1_PKPDdataset_ard.csv") %>%
filter(CYCLE == 1)
## Plot response vs exposure
%>%
my_data ggplot(aes(x = AUC, y = sCHG)) +
geom_point() +
scale_y_continuous(breaks = seq(-800, 800, 200)) +
theme_gray(base_size = 10) +
labs(x = "RESN", y = "LIDV", title = "")
Exploratory exposure-response analysis
This example illustrates the importance of understanding the scientific context when exploring data graphically. An exploratory data analysis is more than just “plotting data”; it can lead to a deeper understanding and inform next steps (Gabry et al. 2019; Gelman 2004). However, like an analysis that is poorly thought through, a poorly implemented graph can also deceive.
Consider an inhaled drug intended to improve lung function, with the target site of action in the lung. The drug is also absorbed systemically from the lung. Suppose that the team wants to fine-tune the choice of a recommended dose. A typical starting point for this question is often a plot of the response variable of interest against a summary measure of plasma concentration (e.g. the area under the concentration time curve, AUC). Figure 1 shows such a plot, generated using the default settings of the R package ggplot2.
In terms of good graphical principles, this plot leaves a fair bit to be desired. Several improvements are warranted, including proper axis scaling, gridlines, annotation, font size, etc. One particularly egregious issue is the lack of care in selecting axis labels, leaving programming labels for the plotted variables (presumably only then to make the effort of explaining them in a caption). An improved version is shown in Figure 2, addressing many of these formatting issues. With an added LOESS smoother (Cleveland 1979), we see a positive non-linear trend, suggesting a shallow sigmoidal exposure-response relationship.
%>%
my_data ggplot(aes(x = AUC, y = sCHG)) +
geom_point(alpha = 0.7) +
geom_smooth(method = "loess", colour = "red") +
scale_x_log10(breaks = lbr, labels = llb) +
scale_y_continuous(breaks = seq(-800, 800, 200)) +
annotation_logticks(sides = "b") +
labs(
x = expression(paste("AUC0-24h (h*",mu,"g/mL)", sep = "")),
y = "FEV1 change from baseline (mL)",
title = "Exposure is positively associated with response",
subtitle = "Loess smoother (95% CI)"
+
) paper_theme()
It is tempting, especially when presented with a suboptimal graph, to immediately set about fixing the various graphical imperfections and produce a more appropriate and visually appealing version of the same graph. This is an example of selective attention, focusing on the detail but overlooking the higher purpose of the task (i.e. the “why”). Instead, let us now take a step back and revisit this example in the context of the first law of visual communication: have a clear purpose.
Why are we conducting an exposure-response analysis? Recall that the scientific interest is to fine-tune the dose, and that the drug is inhaled and acting locally in the lung. The implicit assumption of an exposure-response analysis is one of causality. Here, however, plasma concentration is unlikely to be on the causal path from dose to response. What would be a better way to address the scientific question of interest?
%>%
my_data ggplot(aes(x = AUC, y = sCHG, colour = factor(DOSE))) +
geom_point(alpha = 0.5) +
geom_smooth(method = "lm", se = FALSE) +
scale_colour_brewer(palette = "Set2" , name = "Dose (mg)") +
scale_x_log10(breaks = lbr, labels = llb) +
scale_y_continuous(breaks = seq(-800, 800, 200)) +
annotation_logticks(sides = "b") +
labs(
x = expression(paste("AUC0-24h (h*", mu, "g/mL)", sep = "")),
y = "FEV1 change from baseline (mL)",
title = "Exposure is not a better predictor of response than dose") +
paper_theme() +
theme(
legend.position = c("right"),
legend.title = element_text(size = 10)
)
Consider Figure 3, where instead of estimating an overall trend we now look at the trends within dose. Clearly, any apparent trends within dose do not follow a consistent pattern across doses. The only reason why exposure and response appeared associated in the previous two plots is that they share a common cause, namely dose. In other words, dose is a confounder in those plots, and indeed dose is a better predictor of response than systemic concentration. We should build a dose-response model, rather than an exposure-response model, and choose a recommended dose based on this (and any information on safety and tolerability).
sessionInfo()
## R version 4.3.0 (2023-04-21)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.6.3
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: Europe/Zurich
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets
## [6] methods base
##
## other attached packages:
## [1] RColorBrewer_1.1-3 lubridate_1.9.2
## [3] forcats_1.0.0 stringr_1.5.0
## [5] dplyr_1.1.2 purrr_1.0.1
## [7] readr_2.1.4 tidyr_1.3.0
## [9] tibble_3.2.1 ggplot2_3.4.2
## [11] tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] utf8_1.2.3 generics_0.1.3 lattice_0.21-8
## [4] stringi_1.7.12 hms_1.1.3 digest_0.6.31
## [7] magrittr_2.0.3 evaluate_0.21 grid_4.3.0
## [10] timechange_0.2.0 fastmap_1.1.1 Matrix_1.5-4
## [13] jsonlite_1.8.4 mgcv_1.8-42 fansi_1.0.4
## [16] scales_1.2.1 cli_3.6.1 rlang_1.1.1
## [19] crayon_1.5.2 splines_4.3.0 bit64_4.0.5
## [22] munsell_0.5.0 withr_2.5.0 yaml_2.3.7
## [25] tools_4.3.0 parallel_4.3.0 tzdb_0.4.0
## [28] colorspace_2.1-0 vctrs_0.6.2 R6_2.5.1
## [31] lifecycle_1.0.3 htmlwidgets_1.6.2 bit_4.0.5
## [34] vroom_1.6.3 pkgconfig_2.0.3 pillar_1.9.0
## [37] gtable_0.3.3 glue_1.6.2 xfun_0.39
## [40] tidyselect_1.2.0 rstudioapi_0.14 knitr_1.43
## [43] farver_2.1.1 nlme_3.1-162 htmltools_0.5.5
## [46] rmarkdown_2.22 labeling_0.4.2 compiler_4.3.0