# Joint models

## Introduction

An important goal of longitudinal studies is to characterize relationships between different types of response data.

For instance, in a PKPD population study, we may be interested in the relationship between certain pharmacokinetics (absorption, distribution, metabolism and excretion) and pharmacodynamics (biochemical and physiological effects) of a drug. To do this, we need to measure both types of response data for several individuals from the same population, then try to characterize their relationship.

Alternatively, many clinical trials and reliability studies generate both longitudinal and survival (time-to-event) data. For example, in HIV clinical trials the viral load and the concentration of CD4 cells are widely used as biomarkers for progression to AIDS when studying the efficacy of drugs to treat HIV-infected patients. We might then be interested in the relationship between these variables and events such as seroconversion or death.

Therefore, in general a joint model is one that allows us to simultaneously describe the distribution of different types of observations made on the same individual. As usual, we consider this in the population context.

Suppose that we have $L$ different types of observations for individual $i$: $y_i^{(1)}=(y_{ij}^{(1)},1\leq j \leq n_{i,1})$, $y_i^{(2)}=(y_{ij}^{(2)},1\leq j \leq n_{i,2})$, ..., $y_i^{(L)}=(y_{ij}^{(L)},1\leq j \leq n_{i,L})$, where $n_{i,\ell}$ is the number of observations of type $\ell$ made on individual $i$. Note that for a given individual, $n_{i,\ell}$ may differ from one observation type $\ell$ to another, and so may the observation times $(t_{ij}^{(\ell)})$.

Let $y_i$ denote the set of observations for individual $i$: $y_i = (y_i^{(1)},y_i^{(2)},\ldots,y_i^{(L)})$. For each individual, the joint probability distribution of the observations $y_i$ and the individual parameters $\psi_i$ can be decomposed as follows:

$$\begin{eqnarray} \pyipsii(y_i,\psi_i;\theta) &=& \pcyipsii(y_i | \psi_i) \, \ppsii(\psi_i;\theta) \\ & =& \pcyipsii(y_i^{(1)},y_i^{(2)},\ldots,y_i^{(L)} | \psi_i) \, \ppsii(\psi_i;\theta) . \end{eqnarray}$$

We can then distinguish between different types of dependency between observations: independence, conditional independence and conditional dependence.

## Independent observations

Suppose first that the vector of individual parameters $\psi_i$ can be decomposed into $L$ independent sub-vectors $\psi_i^{(1)}$, $\psi_i^{(2)}$, ..., $\psi_i^{(L)}$ such that $y_i^{(\ell)}$ depends only on $\psi_i^{(\ell)}$:

$$\begin{eqnarray} \pyipsii(y_i,\psi_i;\theta) &=& \pyipsii\left(y_i^{(1)},y_i^{(2)},\ldots,y_i^{(L)},\psi_i^{(1)}, \psi_i^{(2)}, \ldots , \psi_i^{(L)};\theta\right) \\ &=& \prod_{\ell=1}^{L} \pmacro\left(y_i^{(\ell)},\psi_i^{(\ell)};\theta\right) \\ &=& \prod_{\ell=1}^{L} \pmacro\left(y_i^{(\ell)} | \psi_i^{(\ell)}\right) \pmacro\left(\psi_i^{(\ell)};\theta\right) . \end{eqnarray}$$

Here, joint modeling does not bring anything new to the picture because all information on $\psi_i^{(\ell)}$ is contained in the related set of observations $y_i^{(\ell)}$. We can therefore model separately each set of observations.

**Example: A PK and PD model for warfarin data**

Here, 32 healthy volunteers received a 1.5 mg/kg single oral dose of warfarin, an anticoagulant normally used in the prevention of thrombosis.

The warfarin plasma concentration $C$ and the prothrombin complex activity (PCA) $E$ were then measured at various times for these volunteers. The figure represents the PK data (on the left) and the PD data (on the right).

*Figure: warfarin PK and PD data.*

First, we consider two entirely independent parametric models: a simple one-compartment model $f_1$ for the PK data and a rebound model $f_2$ for the PD data. For any $t>0$,

$$\begin{eqnarray} C(t) &=& \displaystyle{ \frac{D\, k_a}{V(k_a-k_e)} } \left( e^{-k_e \, t} - e^{-k_a \, t} \right) \\ E(t) &=& 100\left(\displaystyle{ \frac{\beta}{1+\beta} } e^{-\alpha \, t} + \displaystyle{ \frac{1}{1+\beta \, e^{-\gamma \, t} } }\right) . \end{eqnarray}$$
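These two closed-form predictions are straightforward to implement; below is a minimal sketch in Python, where all numerical parameter values are purely illustrative (they are not fitted estimates from the warfarin study):

```python
import numpy as np

def pk_one_cpt(t, D, ka, V, ke):
    """One-compartment model with first-order absorption: concentration C(t)."""
    return (D * ka) / (V * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

def pd_rebound(t, alpha, beta, gamma):
    """Rebound model for the PCA effect E(t), expressed as a percentage."""
    return 100 * (beta / (1 + beta) * np.exp(-alpha * t)
                  + 1 / (1 + beta * np.exp(-gamma * t)))

t = np.linspace(0, 120, 241)                          # time grid (hours)
C = pk_one_cpt(t, D=100, ka=1.0, V=8.0, ke=0.05)      # illustrative PK values
E = pd_rebound(t, alpha=0.05, beta=2.0, gamma=0.1)    # illustrative PD values
```

Note that $C(0)=0$ and $E(0)=100$, consistent with a baseline PCA of 100% before any drug effect.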

We can then model the observations supposing for example a combined error model for the PK data and an additive one for the PD data:

$$y_{ij}^{(1)} = C(t_{ij}^{(1)} ; \psi_i^{(1)}) + \left(a_1 + b_1\,C(t_{ij}^{(1)};\psi_i^{(1)})\right)\teps_{ij}^{(1)} \quad\quad (1)$$
$$y_{ij}^{(2)} = E(t_{ij}^{(2)} ; \psi_i^{(2)}) + a_2 \, \teps_{ij}^{(2)} , \quad\quad (2)$$

where $\psi_i^{(1)}=(ka_i,V_i, ke_i)$ and $\psi_i^{(2)}=(\alpha_i,\beta_i,\gamma_i)$ are independent individual parameter vectors that we suppose log-normally distributed.
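Simulating one individual's PK observations under this specification amounts to drawing log-normal individual parameters and applying the combined error model (1); a sketch under illustrative population values:

```python
import numpy as np

rng = np.random.default_rng(0)

def pk_model(t, D, ka, V, ke):
    """One-compartment model with first-order absorption."""
    return (D * ka) / (V * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

# Log-normally distributed individual PK parameters (illustrative population
# medians 1.0, 8.0, 0.05 and a common 0.2 standard deviation on the log scale)
ka_i, V_i, ke_i = np.exp(rng.normal(np.log([1.0, 8.0, 0.05]), 0.2))

t_obs = np.array([0.5, 1, 2, 4, 8, 24, 48, 96], dtype=float)
C_pred = pk_model(t_obs, D=100, ka=ka_i, V=V_i, ke=ke_i)

# Combined error model (1): y = C + (a1 + b1*C) * eps, eps ~ N(0, 1)
a1, b1 = 0.2, 0.1
y1 = C_pred + (a1 + b1 * C_pred) * rng.standard_normal(t_obs.size)
```

The PD observations would be simulated in exactly the same way with the additive error model (2), using an independent parameter vector $\psi_i^{(2)}$.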

Now that the two models have been defined, we can jointly model the two data types. As they are independent, this means that we can simply use the PK model to fit the concentration data and the PD model to fit the PCA data. The figure shows the observed data and the individual predictions given by the two models for the 4 individuals.

*Figure: jointly fitted PK and PD warfarin data for 4 individuals using two independent models.*

In the same way that we jointly modeled these two types of independent continuous data, we can construct joint models using different types of data at the same time, i.e., various combinations of continuous, categorical, count and survival data, etc., if they are independent.

**Example: Longitudinal and time-to-event data model**

Consider the following joint model for survival and longitudinal data:

$$\begin{eqnarray} y_{ij} &=& f(t_{ij} ; \psi_i^{(1)}) + g(t_{ij} ;\psi_i^{(1)})\teps_{ij} \\ \prob{T_i>t} &=& S(t ; \psi_i^{(2)}) . \end{eqnarray}$$

The continuous outcome $y_{ij}$ and the time to event $T_i$ are independent if $\psi_i^{(1)}$ and $\psi_i^{(2)}$ are independent.
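Under this independence assumption the two parts can be simulated (and fitted) completely separately; a minimal sketch assuming a linear longitudinal trend and a Weibull survival model, both chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Longitudinal part: linear trend with additive noise, psi1 = (b0, b1)
b0, b1, sigma = 5.0, -0.1, 0.3
t_ij = np.arange(0, 10.0)
y_ij = b0 + b1 * t_ij + sigma * rng.standard_normal(t_ij.size)

# Survival part: Weibull S(t) = exp(-(t/lam)^k), with psi2 = (lam, k)
# drawn independently of psi1; event time by inverse-transform sampling
lam, k = 20.0, 1.5
u = rng.uniform()
T_i = lam * (-np.log(u)) ** (1 / k)   # solves S(T) = u
```

Because $\psi_i^{(1)}$ and $\psi_i^{(2)}$ share no components, the likelihood factorizes and nothing is gained by estimating the two models jointly.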

**Remark**

If the event is drop-out, the missingness mechanism is sometimes called MCAR (missing completely at random). This means that the continuous outcome does not provide any information about drop-out.

## Conditionally independent observations

In this case, the various observation types no longer depend only on disjoint (i.e., independent) individual parameters. We therefore write $\psi_i$ for the overall set of (partially or fully shared) individual parameters. The observations are nevertheless supposed to be independent conditionally on $\psi_i$:

$$\begin{eqnarray} \pyipsii(y_i,\psi_i;\theta) &=& \pyipsii(y_i^{(1)},y_i^{(2)},\ldots,y_i^{(L)},\psi_i;\theta) \\ &=& \left( \prod_{\ell=1}^{L} \pmacro(y_i^{(\ell)} | \psi_i) \right) \pmacro(\psi_i;\theta) . \end{eqnarray}$$

In such cases, each observation provides information on the individual parameter vector $\psi_i$.

This is the most common case when we are simultaneously modeling different types of longitudinal data of the form:

$$\begin{eqnarray} y_{ij}^{(1)} &=& f_1(t_{ij}^{(1)} ; \psi_i) + g_1(t_{ij}^{(1)};\psi_i)\teps_{ij}^{(1)} \\ y_{ij}^{(2)} &=& f_2(t_{ij}^{(2)} ; \psi_i) + g_2(t_{ij}^{(2)};\psi_i)\teps_{ij}^{(2)} . \end{eqnarray}$$

Here, the predictions $f_1$ and $f_2$ both depend on the same vector of individual parameters, which induces dependency between the observations $y_{i}^{(1)}$ and $y_{i}^{(2)}$. However, these observations are conditionally independent if the residual errors $\teps_{ij}^{(1)}$ and $\teps_{ij}^{(2)}$ are independent.
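A toy simulation of such a shared-parameter model, in which both predictions use the same individual rate constant $k_i$ while the residual errors remain independent (the particular functional forms are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# One shared, log-normally distributed individual parameter vector (A_i, k_i)
A_i, k_i = np.exp(rng.normal(np.log([10.0, 0.2]), 0.3))

t1 = np.linspace(0.5, 24, 8)   # observation times for response 1
t2 = np.linspace(1, 24, 6)     # observation times for response 2 (may differ)

f1 = A_i * np.exp(-k_i * t1)          # prediction for response 1
f2 = 100 * (1 - np.exp(-k_i * t2))    # prediction for response 2, same k_i

# Conditionally independent observations: independent residual errors
y1 = f1 + 0.5 * rng.standard_normal(t1.size)
y2 = f2 + 2.0 * rng.standard_normal(t2.size)
```

Marginally, $y^{(1)}$ and $y^{(2)}$ are correlated through $k_i$, but given $\psi_i=(A_i,k_i)$ they are independent.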

**Example: A joint PKPD model for warfarin data**

Pertinent PKPD models aim to establish a link between a drug's concentration and its effect.

An indirect response model assumes that a drug does not instantaneously affect the PD response. Instead, the drug affects a precursor which then influences the PD measure. Here, as warfarin levels increase, prothrombin synthesis is inhibited, which in turn has anti-coagulant effects. Such phenomena can be approximated with a very simple ODE-based mathematical model for the PD component (we use the same one compartment model for the PK component):

$$\begin{eqnarray} C(t) &=& \displaystyle{ \frac{D\, k_a}{V(k_a-k_e)} } \left( e^{-k_e \, t} - e^{-k_a \, t} \right) \\ E(t) &=& \displaystyle{ \frac{k_{in} }{ k_{out} } }, \ \ \ \ t\leq 0 \\ \displaystyle{ \frac{d}{dt} }E(t) &=& k_{in}\left( 1 - \displaystyle{ \frac{C(t)}{IC_{50} + C(t)} } \right) - k_{out}\,E(t), \ \ \ \ t >0 . \end{eqnarray}$$
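The indirect response ODE can be integrated numerically from its steady-state initial condition $E(0)=k_{in}/k_{out}$; a simple explicit-Euler sketch with illustrative parameter values (in practice one would use a dedicated ODE solver):

```python
import numpy as np

def pk_conc(t, D, ka, V, ke):
    """Closed-form one-compartment concentration C(t)."""
    return (D * ka) / (V * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

# Illustrative parameter values (not fitted estimates)
D, ka, V, ke = 100.0, 1.0, 8.0, 0.05
kin, kout, IC50 = 5.0, 0.05, 1.0

dt = 0.01
t_grid = np.arange(0.0, 120.0, dt)
E = np.empty_like(t_grid)
E[0] = kin / kout                      # baseline: E(0) = kin/kout

# Explicit Euler integration of dE/dt = kin*(1 - C/(IC50 + C)) - kout*E
for n in range(1, t_grid.size):
    C = pk_conc(t_grid[n - 1], D, ka, V, ke)
    dEdt = kin * (1 - C / (IC50 + C)) - kout * E[n - 1]
    E[n] = E[n - 1] + dt * dEdt
```

As the concentration rises, production of $E$ is inhibited and the response falls below its baseline, then returns toward $k_{in}/k_{out}$ as the drug washes out.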

We could then use the same residual error models (1) and (2) given in the previous example.

We can still suppose that the vectors $\psi_i^{(1)}=(ka_i,V_i, ke_i)$ and $\psi_i^{(2)}=(IC_{50,i},k_{in,i},k_{out,i})$ are independent; however, since the effect $E$ predicted by the model is a function of the concentration $C$, both observation types depend on the PK parameters $\psi_i^{(1)}$, which introduces dependence between them.

If the residual errors $(\teps_{ij}^{(1)})$ and $(\teps_{ij}^{(2)})$ are independent, then the observations are conditionally independent, i.e., when the predicted concentration $C(t)$ is given, the observed concentrations $\by^{(1)}$ do not bring any further information on the distribution of the PD observations $\by^{(2)}$.

This joint model can be used to model the same warfarin data as before (again, using $\monolix$). The figure shows the resulting individual predictions.

*Figure: fitted PK and PD warfarin data for 4 individuals using a conditionally independent joint model.*

We can extend this framework to different types of data, considering for example categorical observations $y_i^{(2)}$ for which the probabilities $\prob{y_{ij}^{(2)} = k}$ depend on $f_1(t_{ij}^{(2)};\psi_i)$ and consequently $\psi_i$. We can also consider survival data for which the risk function depends on $f_1$.

**Example: Longitudinal and time-to-event data model**

Consider a joint model for survival and longitudinal data, assuming now that the hazard function (or equivalently the survival function) depends on the continuous data prediction:

$$\begin{eqnarray} y_{ij} &=& f(t_{ij} ; \psi_i) + g(t_{ij} ;\psi_i)\teps_{ij} \\ \prob{T_i>t} &=& S(t ; f(t ; \psi_i)) . \end{eqnarray}$$

If for instance $(y_{ij})$ is the measured viral load of an HIV infected patient, we can assume that the probability of events such as death, seroconversion or drop-out depends on the "true" viral load $f(t ; \psi_i)$.
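Event times for such a model can be simulated by discretizing the cumulative hazard. The sketch below assumes a proportional-hazards link $h(t) = h_0 \exp(\beta f(t;\psi_i))$ and an exponentially decaying biomarker trajectory; both of these choices, and all numerical values, are illustrative assumptions rather than part of the model above:

```python
import numpy as np

rng = np.random.default_rng(3)

def f(t, A=4.0, k=0.05):
    """Illustrative 'true' biomarker trajectory, e.g. a log viral load."""
    return A * np.exp(-k * t) + 1.0

h0, beta = 0.001, 0.8
dt = 0.1
t_grid = np.arange(0.0, 200.0, dt)

# Cumulative hazard H(t) = integral of h0*exp(beta*f(s)) ds (Riemann sum)
H = np.cumsum(h0 * np.exp(beta * f(t_grid)) * dt)

# Inverse-transform sampling: T = inf{t : H(t) >= -log(U)}, U ~ Uniform(0,1)
u = rng.uniform()
idx = np.searchsorted(H, -np.log(u))
T_i = t_grid[idx] if idx < t_grid.size else np.inf   # censored past t = 200
```

Here a higher biomarker value increases the instantaneous risk, so individuals with large $f(t;\psi_i)$ tend to experience the event earlier.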

**Remark**

If the event is drop-out, the missingness mechanism is sometimes called MAR (missing at random). This means that the probability of drop-out depends on some of the individual parameters, but that the observation itself of the continuous outcome does not provide any additional information. In our example, this means that the probability that a patient leaves the study depends on their true state (i.e., their true but unknown viral load), and not on the measured viral load values.

## Conditionally dependent observations

In this case, there is a dependency structure between types of observation that no longer allows us to decompose the joint model into a product of models with only one type of observation in each.

This kind of dependency occurs when several types of longitudinal data are obtained at the same times, with correlated measurement errors. The joint conditional distribution $\qcyipsii$ of the observations is Gaussian if the residual errors are. The dependency structure between observations can then be characterized by a variance-covariance matrix for the errors.
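For two responses measured at the same times, such correlated residual errors can be generated from their joint variance-covariance matrix via a Cholesky factorization; a minimal sketch with illustrative variances and correlation:

```python
import numpy as np

rng = np.random.default_rng(4)

# Variance-covariance matrix of the residual pair (eps1, eps2) at each time
sigma1, sigma2, rho = 0.5, 2.0, 0.6
Sigma = np.array([[sigma1**2,            rho * sigma1 * sigma2],
                  [rho * sigma1 * sigma2, sigma2**2           ]])

n_times = 8
chol = np.linalg.cholesky(Sigma)
eps = rng.standard_normal((n_times, 2)) @ chol.T   # each row ~ N(0, Sigma)
```

Each row of `eps` is one pair of correlated measurement errors, to be added to the two model predictions at the corresponding observation time.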

We can also consider a natural decomposition of this joint distribution into a product of conditional distributions:

$$\begin{eqnarray} \pyipsii(y_i,\psi_i;\theta) &=& \pyipsii(y_i^{(1)},y_i^{(2)},\ldots,y_i^{(L)},\psi_i;\theta) \\ &=& \pmacro(y_i^{(1)} | \psi_i;\theta) \pmacro(y_i^{(2)} | y_i^{(1)}, \psi_i;\theta)\ldots \pmacro(y_i^{(L)} | y_i^{(1)},\ldots,y_i^{(L-1)}, \psi_i;\theta) \pmacro(\psi_i;\theta) . \end{eqnarray}$$

Here, the distribution of $y_i^{(2)}$ depends on the observation $y_i^{(1)}$, the distribution of $y_i^{(3)}$ depends on $y_i^{(1)}$ and $y_i^{(2)}$, etc.
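A toy instance of this sequential decomposition with $L=2$, where $y^{(2)}$ is drawn conditionally on the realized $y^{(1)}$ (the linear form of the dependence is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(6)

n = 10
f1 = np.full(n, 5.0)    # prediction for response 1 given psi_i
f2 = np.full(n, 50.0)   # prediction for response 2 given psi_i

# First draw y1 | psi_i
y1 = f1 + 0.5 * rng.standard_normal(n)

# Then draw y2 | y1, psi_i: its mean shifts with the realized residual of y1,
# so y2 depends on the observation y1 itself, not only on psi_i
c = 2.0
y2 = f2 + c * (y1 - f1) + 1.0 * rng.standard_normal(n)
```

With $c \neq 0$ the factorization into single-response models no longer holds: knowing $y^{(1)}$ changes the conditional distribution of $y^{(2)}$ even when $\psi_i$ is given.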

**Example: A longitudinal data and drop-out model**

Consider a joint model for longitudinal data and drop-out, assuming now that the hazard function (or equivalently the survival function) depends on the observed data itself:

$$\begin{eqnarray} y_{ij} &=& f(t_{ij} ; \psi_i) + g(t_{ij} ;\psi_i)\teps_{ij} \\ \prob{T_i>t} &=& S(t ; (y_{ij}, t_{ij}<t)) . \end{eqnarray}$$

This drop-out mechanism is sometimes called MNAR (missing not at random).

In this example where $(y_{ij}, t_{ij}<t)$ is the sequence of measured viral loads before time $t$, MNAR means that the probability that a patient leaves the study depends on their previously-measured viral concentrations.
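One way to simulate such an MNAR mechanism is to make the per-visit drop-out probability depend on the last *measured* value; the sketch below assumes a logistic link and illustrative parameter values (neither is prescribed by the model above):

```python
import numpy as np

rng = np.random.default_rng(5)

def logistic(x):
    return 1 / (1 + np.exp(-x))

t_ij = np.arange(0, 12.0)        # scheduled visit times
y_true = 5.0 - 0.1 * t_ij        # illustrative 'true' trajectory f(t; psi_i)
y_obs = []
dropout_time = None

for t, f_t in zip(t_ij, y_true):
    y = f_t + 0.3 * rng.standard_normal()
    y_obs.append(y)
    # MNAR: the probability of dropping out before the next visit depends on
    # the measured value y itself, not only on the individual parameters
    if rng.uniform() < logistic(-2.0 + 0.5 * y):
        dropout_time = t
        break
```

Contrast this with the MAR case, where the drop-out probability would depend only on the underlying $f(t;\psi_i)$ and not on the noisy measurement $y$.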
