Evidence Synthesis for Decision Making in Healthcare

This post presents a JAGS version of a WinBUGS model presented in the classic textbook Evidence Synthesis for Decision Making in Healthcare by Nicky J. Welton, Alexander J. Sutton, Nicola J. Cooper, Keith R. Abrams, and A.E. Ades.
Author

Joseph Rickert and Robert Horton

Published

March 12, 2025

This post is based on the textbook Evidence Synthesis for Decision Making in Healthcare (ESDMH) by Nicky J. Welton, Alexander J. Sutton, Nicola J. Cooper, Keith R. Abrams, and A.E. Ades. This textbook is an exemplary presentation of healthcare decision analysis and Bayesian modeling. The only impediment we perceive to its aging well and enjoying a long shelf life is that all of the code was written in WinBUGS, pioneering but now obsolete software for evaluating Bayesian MCMC models. Below, we present a JAGS version of the WinBUGS model in Example 2.1 on page 26 of the text. We hope that this post will be useful to readers who would like to work through the examples in the text using tools that are easily accessible.

The JAGS Workflow

We will follow the workflow for building JAGS models presented in the Introduction to jagsUI vignette. A JAGS workflow comprises the following steps:

  1. Organize data into a named list
  2. Write model file in the BUGS language
  3. Specify initial MCMC values (optional)
  4. Specify the parameters whose posterior samples should be saved for analysis
  5. Specify MCMC settings
  6. Run JAGS
  7. Examine output

The Blockers Model

We will run the random effects model presented in Example 2.1 using the jagsUI and rjags packages, which accept models written in the BUGS modeling language and call the JAGS Bayesian engine to evaluate them. The packages required are loaded here.

Show the code
library('rjags')
library('coda')
library('jagshelper')
library('jagsUI')     # calls rjags
library('mcmcplots')  # caterplot
library('dplyr')
library('readxl')

The Data

The data comprise the empirical log odds ratios, $Y_j$:

$$Y_j = \log\left(\frac{\delta_j}{1 - \delta_j}\right)$$

along with their associated sample variances, $V_j$, for twenty-two randomized clinical trials in which patients who suffered a myocardial infarction were treated either with beta-blockers or a placebo. The Excel file containing the data, along with the original WinBUGS code, is available here.
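The book supplies Yj and Vj directly, but it may help to see how such values typically arise from a trial's 2×2 table. Below is a minimal sketch using the standard (Woolf) formulas; the counts are made up for illustration and are not from the Blocker data.

Show the code
# Hypothetical counts for a single trial (NOT from the Blocker data)
r_t <- 10; n_t <- 120   # deaths / patients in the beta-blocker arm
r_c <- 15; n_c <- 125   # deaths / patients in the placebo arm

# Empirical log odds ratio and its approximate (Woolf) sampling variance
Y <- log((r_t / (n_t - r_t)) / (r_c / (n_c - r_c)))
V <- 1/r_t + 1/(n_t - r_t) + 1/r_c + 1/(n_c - r_c)
c(Y = Y, V = V)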

Here, we read the data from a local Excel file. Note that the Blocker Excel file is among the supporting materials that can be downloaded from the Wiley site.

Show the code
BLOCKER <- read_excel("BLOCKER.xls") |>
  mutate(trial = 1:length(Y))

head(BLOCKER, 3)
# A tibble: 3 × 3
        Y     V trial
    <dbl> <dbl> <int>
1  0.0282 0.723     1
2 -0.741  0.233     2
3 -0.541  0.319     3

The JAGS Model

Next, we organize the data, which must be structured as a list for JAGS.

Show the code
data <- BLOCKER |> select(-trial) |>
  as.list()
data['Nstud'] <- nrow(BLOCKER)
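As a quick sanity check, JAGS expects a named list of numeric vectors and scalars. The output shown in the comments below is a sketch of what we would expect given the head() of the data above.

Show the code
str(data)
# List of 3
#  $ Y    : num [1:22] 0.0282 -0.741 -0.541 ...
#  $ V    : num [1:22] 0.723 0.233 0.319 ...
#  $ Nstud: int 22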

Specify the JAGS Model

The following code specifies the random effects JAGS model. The study-specific log odds ratios, $\delta_j$, are assumed to be samples from a common random distribution: $\delta_j \sim N(d, \tau^2)$, where $d$ is the population mean of the log odds ratios and $\tau^2$ is the between-studies variance. We assume a flat uniform prior distribution for tau. Note that dnorm() in the BUGS language is parameterized by mean and precision (the reciprocal of the variance), which is why the model below converts tau to the precision prec.

See the JAGS manual for details.

Show the code
# Model
model_code <- "model {
  # Likelihood
  for (j in 1:Nstud) {
    P[j] <- 1/V[j]      # Calculate precision
    Y[j] ~ dnorm(delta[j],P[j])
    delta[j] ~ dnorm(d,prec)
  }

  # Priors
  d ~ dnorm(0,1.0E-6)
  prec <- 1/tau.sq
  tau.sq <- tau*tau   # between-study variance
  tau ~ dunif(0,10)   # Uniform on SD
}" %>% textConnection

# Note that model_code is a string and the textConnection function is used 
# to pass this string to the jags() function further down in the code.
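One practical caveat: a textConnection is consumed when JAGS reads it, so if you call jags() a second time you will need to re-create model_code (or write the model string to a file and pass the file path instead).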

This code initializes the parameter values. Note that we use a function so that each of the multiple MCMC chains is initialized with values drawn from reasonable probability distributions.

Show the code
# Note that the WinBUGS code in the book initializes the MCMC algorithm with a single chain as follows:
# initial_values <- list(list(
#   delta = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0), d = 0, tau = 1))

# Instead, we use the following code, which initializes the MCMC algorithm with
# three chains whose starting values are drawn from probability distributions.
set.seed(9999)
inits  <- function() {
  list(
    delta = rnorm(22, 0, 0.5),
    d = rnorm(1, 0, 1),
    tau = runif(1, 0, 3)
  )
}

This next block of code specifies the MCMC settings; a sketch of the complete jags() call appears after the list. The key parameters are:

  • n.chains: The number of MCMC chains to run
  • n.adapt: The number of iterations to run in the JAGS adaptive phase. See Andrieu & Thoms (2008) for a discussion of adaptive MCMC
  • n.iter: The number of iterations per chain, including burn-in
  • n.burnin: The number of iterations to discard as burn-in
  • n.thin: The thinning rate: only every kth sample is kept, which reduces autocorrelation in the saved chains. See Link & Eaton (2012) for a discussion of thinning.
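The call that actually runs JAGS is sketched below. The parameter vector and the specific settings (chains, iterations, burn-in) are our assumptions, chosen to be consistent with the output shown in the next section; note that jagsUI monitors the deviance automatically.

Show the code
# Parameters whose posterior samples we want to save
params <- c("d", "tau", "delta")

# Hypothetical MCMC settings: three chains of 10,000 iterations each,
# with the first 5,000 discarded as burn-in and no thinning.
out_initial <- jagsUI::jags(
  data               = data,
  inits              = inits,
  parameters.to.save = params,
  model.file         = model_code,
  n.chains           = 3,
  n.adapt            = 1000,
  n.iter             = 10000,
  n.burnin           = 5000,
  n.thin             = 1
)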

Examine the Output

The posterior distributions for the model parameters are the main results of the model. jags() samples these distributions and computes the posterior mean, standard deviation, credible intervals, and diagnostic statistics for the parameters d, tau, the δj parameters, and the deviance. The values shown here are very close to those reported in the text. (Note that, for purposes of presentation, we display only the first and last four rows of the summary, which include five of the δj parameters.)

Show the code
options(width = 999)
res <- out_initial
h_res <- head(res$summary, 4)
t_res <- tail(res$summary, 4)
signif(rbind(h_res, t_res), 3)
            mean     sd     2.5%     25%    50%    75%   97.5% Rhat n.eff overlap0     f
d         -0.247 0.0660 -0.37200 -0.2900 -0.248 -0.203 -0.1130 1.00  1000        0 1.000
tau        0.132 0.0811  0.00849  0.0674  0.125  0.184  0.3080 1.01   405        0 1.000
delta[1]  -0.239 0.1660 -0.57600 -0.3270 -0.244 -0.156  0.1220 1.00  3930        1 0.930
delta[2]  -0.288 0.1580 -0.64600 -0.3670 -0.277 -0.196  0.0112 1.00 15000        1 0.971
delta[20] -0.240 0.1300 -0.50100 -0.3200 -0.243 -0.167  0.0403 1.00 10800        1 0.959
delta[21] -0.325 0.1410 -0.65300 -0.4020 -0.309 -0.231 -0.0831 1.00 11900        0 0.994
delta[22] -0.319 0.1430 -0.65000 -0.3930 -0.302 -0.225 -0.0779 1.00 15000        0 0.993
deviance   7.810 4.5800 -1.22000  4.5100  8.050 11.300 16.0000 1.00  1950        1 0.952

Diagnostic Statistics

The diagnostic statistics are interpreted as follows:

  • Rhat, the Gelman-Rubin Statistic, is a diagnostic that compares the variance within chains to the variance between chains. Values close to 1 (typically less than 1.1) indicate that the chains have converged.
  • n.eff: an estimate of the number of independent samples to which the autocorrelated samples in the chain are equivalent. A higher number suggests more reliable estimates.
  • overlap0 = 0 indicates that the 95% credible interval does not include 0, suggesting a statistically significant effect.
  • f is the proportion of the posterior with the same sign as the mean.
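A quick programmatic screen along these lines: the jagsUI output object stores the Rhat values by parameter, so (as a sketch) we can flag anything above the conventional 1.1 threshold.

Show the code
rhats <- unlist(out_initial$Rhat)  # Rhat for every saved parameter
round(max(rhats), 3)               # should be close to 1 after convergence
names(rhats[rhats > 1.1])          # parameters, if any, that have not converged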

Additionally, the jags() function returns two alternative penalized deviance statistics, the deviance information criterion, DIC, and the penalized expected deviance, pD, which are generated via the dic.samples() function. Both of these statistics penalize model complexity: smaller is better. For this model, DIC = 18.3016 and pD = 10.4929.

As calculated by jags(), DIC is an approximation to the penalized deviance used when only point estimates are available, i.e., $D(\theta) = -2\log(L(\theta))$, where $L(\theta)$ is the likelihood of the model given the data. The approximation holds asymptotically when the effective number of parameters is much smaller than the sample size and when the posterior distributions of the model parameters are approximately normal. See page 6 of the rjags pdf for details and Bayesian measures of model complexity and fit by Spiegelhalter et al. (2002) for the theory.

For background on pD, see Penalized loss functions for Bayesian model comparison, Plummer (2008).

Show the code
dev <- data.frame(x = out_initial$pD, y = out_initial$DIC)
names(dev) <- c("pD", "DIC")
dev
        pD      DIC
1 10.49292 18.30155
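As a consistency check, note that DIC here is roughly the posterior mean deviance plus the penalty: 7.81 + 10.49 ≈ 18.30.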

Note that all of the output we have been discussing can be generated at once just by printing the model output object: out_initial.

Density Plots

The density plots show the posterior distribution of the estimated parameters: the posterior distribution of the population mean, d, the between-study variance, τ2, the study-specific log odds ratios, δj, and the deviance. We only show the first density plot here, but the sharp peak is representative of all of the δj plots. (Note that an easy modification of the code will plot them all.)

Show the code
jagsUI::densityplot(
  out_initial,
  parameters = c("d", "tau", "deviance", "delta[1]"),
  layout = c(2, 2)
)
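As mentioned, plotting all of the δj densities requires only a small change: passing the non-indexed name "delta" tells jagsUI to plot every delta[j].

Show the code
jagsUI::densityplot(out_initial, parameters = "delta")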

Whisker Plot

The following plot shows the posterior mean and 95% credible interval for the population mean, d, and for the study-specific log odds ratios, δj.

Show the code
param_names <- c("d", sprintf("delta[%d]", 1:22))  # avoid masking base::colnames
caterplot(out_initial$samples[, param_names])

MCMC Diagnostics

A key MCMC diagnostic reported by the jags() function of the jagsUI package is sufficient.adapt, a logical value indicating whether the number of iterations for adaptation was sufficient. For this model:

Show the code
cat("Sufficient Adaptation =",
    out_initial$mcmc.info$sufficient.adapt,
    "\n")
Sufficient Adaptation = TRUE 
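Had sufficient.adapt come back FALSE, the usual remedy would be to re-run jags() with a larger n.adapt. A sketch, reusing the hypothetical settings from above (and remembering to re-create the textConnection first):

Show the code
out_longer_adapt <- jagsUI::jags(
  data = data, inits = inits, parameters.to.save = params,
  model.file = model_code,   # re-create this textConnection before re-running
  n.chains = 3, n.adapt = 5000, n.iter = 10000, n.burnin = 5000
)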

Trace Plots

The trace plots show the evolution of the three Markov Chains (each in a different color) for all of the estimated variables. The chains appear to be mixing well. However, it is noteworthy that some of the blue chains seem to get stuck for a while, as indicated by the short, flat segments. The plot of the δ1 parameter is representative of the behavior of the other delta parameters. You can check this yourself by just using “delta” in the plot command below to plot all of the δj parameters.

In addition to the model parameters, the traceplot() function also displays the MCMC trace for the deviance, a statistical measure that indicates how well the model fits the data.

Show the code
jagsUI::traceplot(
  out_initial,
  parameters = c("d", "tau", "deviance", "delta[1]"),
  layout = c(2, 2)
)

Some Closing Remarks

ESDMH takes the point of view that Bayesian models, which integrate statistical estimates of clinical outcomes with economic considerations, are an elegant and powerful approach to decision-making in healthcare. This post is just a first step towards showing how ESDMH develops this process. In posts to come, we hope to present more examples from the text.

Bob Horton started his career as a molecular biologist, studying genes involved in immune responses (MHC evolution and TCR repertoire), and developing genetic engineering techniques. Analyzing and simulating biological data led him to data science and his current interests, which include semantic searching of text data and decision modeling.