# 1 Overview

The most commonly used model for cost-effectiveness analysis (CEA) is the cohort discrete time state transition model (cDTSTM), commonly referred to as a Markov cohort model. In this tutorial we demonstrate implementation with R of the simplest of cDSTMs, a time-homogeneous model with transition probabilities that are constant over time. The entire analysis can be run using Base R (i.e., without installing any packages). However, we will use the following packages to create a nice looking cost-effectiveness table.

library("rcea")
library("knitr")
library("kableExtra")
library("magrittr")
library("tibble")

As an example, we will consider the 4-state sick-sicker model that has been described in more detail by Alarid-Escudero et al. The model will be used to compare two treatment strategies, a “New” treatment and the existing “standard of care (SOC)”. The model consists 4 health states. Ordered from worst to best to worst, they are: Healthy (H), Sick (S1), Sicker (S2), and Death (D). Possible transitions from each state are displayed in the figure below. # 2 Theory

## 2.1 Transition probabilities

Transition probabilities are the building blocks of Markov cohort models and used to simulate the probability that a cohort of patients is in each health state at each model cycle. To set notation, let $$p_t$$ be the transition probability matrix governing transitions from one time period (i.e., model cycle) to the next. $$x_t$$ is the state occupancy vector at time $$t$$. In this example, $$p_t$$ is a $$4 \times 4$$ matrix and $$x_t$$ is a $$1 \times 4$$ vector containing the proportion of patients in each of the 4 health states.

At time $$0$$, the state vector is $$x_0 = (1, 0, 0, 0)$$ since all patients are healthy. The state vector at time $$t+1$$ for each of $$T$$ model cycles can be computed using matrix multiplication,

$x_{t+1}= x_t p_t, \quad t = 0,\ldots,T$ The matrix containing state vectors across all model cycles is often referred to as the “Markov trace” in health economics. Note that since we are considering a time-homogeneous model in this example, we can remove the subscript from the transition matrix and set $$p_t = p$$.

## 2.2 Costs and outcome

To perform a CEA, Markov models estimate expected (i.e., average) costs and outcomes for the cohort of interest. In our examples, quality-adjusted life-years (QALYs) will be used as the outcome.

Costs and QALYs are computed by assigning values to the health states (i.e., “state values”) and weighting the time spent in each state by these state values. For example, if utility in each of the four health states is given by the $$4 \times 1$$ vector $$u = (1, 0.75, 0.5, 0)^T$$ and say, $$x_1 = (.8, .1, 0.08, 0.02)$$, then the weighted average utility would be given by $$x_{1}u = 0.915$$. More generally, we can define a state vector at time $$t$$ as $$z_t$$ so that the expected value at time $$t$$ is given by

$EV_t = x_{t}z_{t}$

Future cost and outcomes are discounted using a discount rate. In discrete time, the “discount weight” at time $$t$$ is given by $$1/(1 + r)^t$$ where $$r$$ is the discount rate; in continuous time, it is given by $$e^{-rt}$$. We will use discrete time here, meaning that the present value of expected values are given by,

$PV_t = \frac{EV_t}{(1+r)^t}$ The simplest way to aggregate discounted costs across model cycles is with the net present value,

$NPV = \sum_{t} PV_t$

It is important to note that in mathematical terms this a discrete approximation to a continuous integral, also known as a Riemann sum. That is, we only know that an expected value lies within a time interval (e.g., in model cycle 1, between year 0 and 1; in model cycle 2, between year 1 and 2; and so on). If we approximate the integral using values at the start of the interval (i.e., transitions occur at the end of the interval), then it is known as a left Riemann sum; conversely, if we use values at the end of the interval (i.e., transitions happen immediately), then it is known as a right Riemann sum. Life-years are overestimated when using a left Riemann sum and underestimated when using a right Riemann sum. A preferable option is to use an average of values at the left and right endpoints (i.e., the trapezoid rule). That said, we will assume events occur at the start of model cycles for simplicity.

# 3 Model parameters

## 3.1 Transition probabilities

### 3.1.1 Standard of care

The annual transition probabilities for SOC among our cohort of interest, say 25-year old patients, are defined as follows.

p_hd <- 0.002 # constant probability of dying when Healthy (all-cause mortality)
p_hs1 <- 0.15 # probability of becoming Sick when Healthy
p_s1h <- 0.5 # probability of becoming Healthy when Sick
p_s1s2 <- 0.105 # probability of becoming Sicker when Sick
p_s1d <- .006 # constant probability of dying when Sick
p_s2d <- .02 # constant probability of dying when Sicker

These can be used to derive the probabilities of remaining in the same state during a model cycle.

p_hh <- 1  - p_hs1 - p_hd
p_s1s1 <- 1 - p_s1h - p_s1s2 - p_s1d
p_s2s2 <- 1 - p_s2d

Using these probabilities, the transition probability matrix can then be defined.

p_soc <- matrix(
c(p_hh,  p_hs1,  0,      p_hd,
p_s1h, p_s1s1, p_s1s2, p_s1d,
0,     0,      p_s2s2, p_s2d,
0,     0,      0,      1),
byrow = TRUE,
nrow = 4, ncol = 4
)
state_names <- c("H", "S1", "S2", "D")
colnames(p_soc) <- rownames(p_soc) <- state_names
print(p_soc)
##        H    S1    S2     D
## H  0.848 0.150 0.000 0.002
## S1 0.500 0.389 0.105 0.006
## S2 0.000 0.000 0.980 0.020
## D  0.000 0.000 0.000 1.000

### 3.1.2 New treatment

In evidence synthesis and cost-effectiveness modeling, treatment effects are typically estimated in relative terms; i.e., the effect of an intervention is assessed relative to a reference treatment. In this case, the SOC is the reference treatment and the effectiveness of the new treatment is assessed relative to the SOC. For simplicity, we will assume that treatment effects are defined in terms of the relative risk, which is assumed to reduce the probability of all transitions to a more severe health state by an equal amount. Note that this is somewhat unrealistic since relative risks are a function of baseline risk and would be unlikely to be equal for each transition. A more appropriate model would define treatment effects in terms of a measure that is invariant to baseline risk such as an odds ratio (which might still vary for each transition).

The relative risk can then be used to create the transition matrix for the new treatment. We will use assume the relative risk is $$0.8$$.

apply_rr <- function(p, rr = .8){
p["H", "S1"] <- p["H", "S1"] * rr
p["H", "S2"] <- p["H", "S2"] * rr
p["H", "D"] <- p["H", "D"] * rr
p["H", "H"] <- 1 - sum(p["H", -1])

p["S1", "S2"] <- p["S1", "S2"] * rr
p["S1", "D"] <- p["S1", "D"] * rr
p["S1", "S1"] <- 1 - sum(p["S1", -2])

p["S2", "D"] <- p["S2", "D"] * rr
p["S2", "S2"] <- 1 - sum(p["S2", -3])

return(p)
}
p_new <- apply_rr(p_soc, rr = .8)

## 3.2 Utility and costs

Utility and (annualized) costs are defined below. We consider two types of costs: medical and treatment. Both utility and costs are 0 in the death state. Treatment costs are assumed equal in each health state.

utility <- c(1, .75, 0.5, 0)
costs_medical <- c(2000, 4000, 15000, 0)
costs_treat_soc <- c(rep(2000, 3), 0)
costs_treat_new <- c(rep(12000, 3), 0)

# 4 Simulation

## 4.1 Health state probabilities

To simulate health state probabilities, we must specify a state vector at time 0. Each patient will start in the Healthy state.

x_init <- c(1, 0, 0, 0)

The state vector during cycle $$1$$ for SOC is computed using simple matrix multiplication.

x_init %*% p_soc
##          H   S1 S2     D
## [1,] 0.848 0.15  0 0.002

We can also easily compute the state vector at cycle 2 for SOC.

x_init %*%
p_soc %*%
p_soc
##             H      S1      S2        D
## [1,] 0.794104 0.18555 0.01575 0.004596

A for loop can be used to compute the state vector at each model cycle. We created a function in the rcea package to do this so that we can reuse it. The model will be simulated until a maximum age of 110; that is, for 85 years for a cohort of 25-year olds.

rcea::sim_markov_chain
## function (x0, p, n_cycles = 85)
## {
##     x <- matrix(NA, ncol = length(x0), nrow = n_cycles)
##     x <- rbind(x0, x)
##     colnames(x) <- colnames(p)
##     rownames(x) <- 0:n_cycles
##     for (t in 1:n_cycles) {
##         x[t + 1, ] <- x[t, ] %*% p
##     }
##     return(x)
## }
## <bytecode: 0x7fa6eff30438>
## <environment: namespace:rcea>

We perform the computation for both the SOC and new treatment. As expected, patients remain in less severe health states longer when using the new treatment.

x_soc <- sim_markov_chain(x_init, p_soc)
x_new <- sim_markov_chain(x_init, p_new)

head(x_soc)
##           H        S1         S2           D
## 0 1.0000000 0.0000000 0.00000000 0.000000000
## 1 0.8480000 0.1500000 0.00000000 0.002000000
## 2 0.7941040 0.1855500 0.01575000 0.004596000
## 3 0.7661752 0.1912946 0.03491775 0.007612508
## 4 0.7453638 0.1893399 0.05430532 0.010990981
## 5 0.7267385 0.1854578 0.07309990 0.014703854
head(x_new)
##           H        S1         S2           D
## 0 1.0000000 0.0000000 0.00000000 0.000000000
## 1 0.8784000 0.1200000 0.00000000 0.001600000
## 2 0.8315866 0.1547520 0.01008000 0.003581440
## 3 0.8078416 0.1634244 0.02291789 0.005816068
## 4 0.7913203 0.1641411 0.03627885 0.008259738
## 5 0.7771663 0.1624533 0.04948624 0.010894190

## 4.2 Expected costs and quality-adjusted life-years

To compute discounted costs and QALYS, it is helpful to write a general function for computing a present value.

rcea::pv
## function (z, dr, t)
## {
##     z/(1 + dr)^t
## }
## <bytecode: 0x7fa6f1bda238>
## <environment: namespace:rcea>

### 4.2.1 QALYs

To illustrate a computation, lets start with discounted expected QALYs after the first model cycle.

x_soc[2, ] # State occupancy probabilities after 1st cycle
##     H    S1    S2     D
## 0.848 0.150 0.000 0.002
invisible(sum(x_soc[2, 1:3])) # Expected life-years after 1st cycle
invisible(sum(x_soc[2, ] * utility)) # Expected utility after 1st cycle
sum(pv(x_soc[2, ] * utility, .03, 1)) # Expected discounted utility after 1st cycle
##  0.9325243

Now let’s compute expected (discounted) QALYs for each cycle.

compute_qalys <- function(x, utility, dr = .03){
n_cycles <- nrow(x) - 1
pv(x %*% utility, dr, 0:n_cycles)
}

qalys_soc <- x_soc %*% utility
dqalys_soc <- compute_qalys(x_soc, utility = utility)
dqalys_new <- compute_qalys(x_new, utility = utility)

head(qalys_soc)
##        [,1]
## 0 1.0000000
## 1 0.9605000
## 2 0.9411415
## 3 0.9271050
## 4 0.9145214
## 5 0.9023818
head(dqalys_soc)
##        [,1]
## 0 1.0000000
## 1 0.9325243
## 2 0.8871161
## 3 0.8484324
## 4 0.8125404
## 5 0.7784024
head(dqalys_new)
##        [,1]
## 0 1.0000000
## 1 0.9401942
## 2 0.8980022
## 3 0.8619435
## 4 0.8285724
## 5 0.7968343

### 4.2.2 Costs

We can do the same for costs. The first column is medical costs and the second column is treatment costs.

compute_costs <- function(x, costs_medical, costs_treat, dr = .03){
n_cycles <- nrow(x) - 1
costs <- cbind(
pv(x %*% costs_medical, dr, 0:n_cycles),
pv(x %*% costs_treat, dr, 0:n_cycles)
)
colnames(costs) <- c("medical", "treatment")
return(costs)
}

dcosts_soc <- compute_costs(x_soc, costs_medical, costs_treat_soc)
dcosts_new <- compute_costs(x_new, costs_medical, costs_treat_new)

head(dcosts_soc)
##    medical treatment
## 0 2000.000  2000.000
## 1 2229.126  1937.864
## 2 2419.321  1876.527
## 3 2581.884  1816.350
## 4 2721.140  1757.443
## 5 2839.541  1699.850
head(dcosts_new)
##    medical treatment
## 0 2000.000  12000.00
## 1 2171.650  11631.84
## 2 2293.695  11270.64
## 3 2391.402  10917.83
## 4 2473.004  10573.78
## 5 2541.624  10238.54

# 5 Cost-effectiveness analysis

We conclude by computing the incremental cost-effectiveness ratio (ICER) with the new treatment relative to SOC. It can be computed by summing the costs and QALYs we simulated in the previous section across all model cycles. Note that since we are assuming that transitions happen immediately, we exclude costs and QALYs measured at the start of the first model cycle.

(sum(dcosts_new[-1, ]) - sum(dcosts_soc[-1, ])) /
(sum(dqalys_new[-1, ]) - sum(dqalys_soc[-1, ]))
##  122946.8

For the sake of presentation, we might want to do a some extra work and create a nice looking table.

format_costs <- function(x) formatC(x, format = "d", big.mark = ",")
format_qalys <- function(x) formatC(x, format = "f", digits = 2)
make_icer_tbl <- function(costs0, costs1, qalys0, qalys1){
# Computations
total_costs0 <- sum(costs0)
total_costs1 <- sum(costs1)
total_qalys0 <- sum(qalys0)
total_qalys1 <- sum(qalys1)
incr_total_costs <- total_costs1 - total_costs0
inc_total_qalys <- total_qalys1 - total_qalys0
icer <- incr_total_costs/inc_total_qalys

# Make table
tibble(
Strategy = c("SOC", "New"),
Costs = c(total_costs0, total_costs1) %>%
format_costs(),
QALYs = c(total_qalys0, total_qalys1) %>%
format_qalys(),
Incremental costs = c("--", incr_total_costs %>%
format_costs()),
Incremental QALYs = c("--", inc_total_qalys %>%
format_qalys()),
ICER = c("--", icer %>% format_costs())
) %>%
kable() %>%
kable_styling() %>%
footnote(general = "Costs and QALYs are discounted at 3% per annum.",
footnote_as_chunk = TRUE)
}
make_icer_tbl(costs0 = dcosts_soc[-1, ], costs1 = dcosts_new[-1, ],
qalys0 = dqalys_soc[-1, ], qalys1 = dqalys_new[-1, ])
Strategy Costs QALYs Incremental costs Incremental QALYs ICER
SOC 204,123 21.08
New 464,390 23.19 260,266 2.12 122,946
Note: Costs and QALYs are discounted at 3% per annum.