Introduction
For the moment, we are still considering for each subject $i$ that there is only one scalar parameter $\psi_i$. The covariate model then consists of defining the prediction $\hpsi_i$ as a function of the subject's covariates $\trcov{c}_i$ and the fixed effects $\bbeta$:
\(
\hpsi_i = \hmodel(\bbeta,\trcov{c}_i).
\)

(1)

We take a statistical approach here. The goal is not necessarily to construct a causal model that supposes a cause-effect relationship between
covariates and the parameter, but one where the covariates partially describe the variability of the parameter.
Consider for example a very simple model that posits a linear relationship between the height $h_i$ of subject $i$ and their weight $w_i$:
\(
\hw_i \ = \ \hmodel(\bbeta,h_i) \ = \ \beta_0 + \beta_1 \, h_i. \)
The parameters $\bbeta = (\beta_0, \beta_1)$ are population parameters, which may vary from one population to the next, but are considered fixed within the same homogeneous population. In this model, the height is a covariate that:
 helps to predict the weight. For an individual of height $h_i$, we predict the weight $\beta_0 + \beta_1 \, h_i$. We use this model without necessarily supposing that there is a cause-effect relationship between height and weight. We merely assume that having information about height gives us some information about weight.
 helps to describe the variability of the weight. Suppose that we make the arbitrary choice of a "reference individual" in the population who has height $h_{\rm pop}$ and weight $w_{\rm pop}$. Then the model lets us show the link between the reference height and weight:
\( w_{\rm pop} = \beta_0 + \beta_1 \, h_{\rm pop}. \)
 Then we can more clearly look at the variability in weight around the reference weight as a function of the variation in height around the reference height:
\( \pred{w}_i - w_{\rm pop} = \beta_1 \, (h_i - h_{\rm pop}). \)
 If the weight is in kg and the height in cm, then for an individual who is 1cm taller than the reference height, we predict a weight of $\beta_1 \,kg$ above the reference weight.
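The two ways of writing the height/weight model can be checked numerically. The following is a minimal sketch in Python; the function names and the numerical values of $\beta_0$, $\beta_1$ and $h_{\rm pop}$ are hypothetical and chosen only for illustration.

```python
def predict_weight(height_cm, beta0, beta1):
    """Predicted weight (kg) under the linear model w = beta0 + beta1 * h."""
    return beta0 + beta1 * height_cm


def predict_weight_centered(height_cm, w_pop, h_pop, beta1):
    """Same prediction, written as a deviation from the reference individual."""
    return w_pop + beta1 * (height_cm - h_pop)


# Hypothetical values, not taken from any data set.
beta0, beta1 = -80.0, 0.9
h_pop = 175.0
w_pop = predict_weight(h_pop, beta0, beta1)  # reference weight

# One centimetre above the reference height adds beta1 kg to the prediction.
delta = predict_weight_centered(h_pop + 1.0, w_pop, h_pop, beta1) - w_pop
```

Both functions return the same prediction for any height; the centered form simply makes the role of the reference individual explicit.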
In more general examples, there is a vector of reference covariates $\trcov{c}_{\rm pop}$. A reference individual is one who would personally have these covariate values. Consequently, $\psi_{\rm pop}=\hmodel(\bbeta,\trcov{c}_{\rm pop})$ is the predicted value of the individual parameter for this virtual individual.
The covariate model therefore describes how $\hpsi_i$ falls around $\psi_{\rm pop}$ as $\trcov{c}_i$ varies around $\trcov{c}_{\rm pop}$:
\(
\hpsi_i - \psi_{\rm pop} = \hmodel(\bbeta,\trcov{c}_i) - \hmodel(\bbeta,\trcov{c}_{\rm pop}).
\)

(2)

For clarity, in the following we distinguish between linear and nonlinear continuous covariate models, and categorical variable models.
Linear models for continuous covariates
In its most simple form, a linear model is one where the individual parameter is modeled as a linear combination of the covariates, i.e.,
\( \hpsi_i \ \ = \ \ \langle \bbeta , \trcov{c}_i \rangle \ \ = \ \ \sum_{\ell=1}^{L} \beta_{\ell}\, \trcov{c}_{i\ell} \, .
\)
Here, the function $\hmodel$ is the inner product of $\bbeta$ and $\trcov{c}_i$.
With respect to a reference individual, this can be rewritten as in (2):
\( \hpsi_i \ \ = \ \ \psi_{\rm pop} + \langle \bbeta , \trcov{c}_i - \trcov{c}_{\rm pop} \rangle. \)
More generally, we usually suppose that the linearity can be with respect to a transformation $h$ of $\hpsi_i$:
\(
h(\hpsi_i) \ \ = \ \ h(\psi_{\rm pop})+ \langle \bbeta , \trcov{c}_i - \trcov{c}_{\rm pop} \rangle .
\)

(3)

$h$ is the transform described in (4) of Gaussian models, such that $h(\psi_i)$ can be supposed Gaussian. As well as covariates such as height and age, $\trcov{c}_i$ may also include transformed ones, e.g., $\log$-weight, weight/(height$^2$), etc.
By combining (4) of Gaussian models and (3), we thus obtain the following equivalent representations of $\psi_i$:
\(\begin{eqnarray}
h(\psi_i) & = & h(\psi_{\rm pop})+ \langle \bbeta , \trcov{c}_i - \trcov{c}_{\rm pop} \rangle + \eta_i \,, \quad \eta_i \sim {\cal N}(0,\omega^2) \\
h(\psi_i) & \sim & {\cal N}(h(\psi_{\rm pop})+ \langle \bbeta , \trcov{c}_i - \trcov{c}_{\rm pop} \rangle , \omega^2).
\end{eqnarray}\)

(4)

This model gives a clear and easily interpreted decomposition of the variability of $h(\psi_i)$ around $h(\psi_{\rm pop})$, i.e., of
$\psi_i$ around $\psi_{\rm pop}$:
i) The fixed component $\langle \bbeta , (\trcov{c}_i - \trcov{c}_{\rm pop}) \rangle$ describes part of this variability by way of covariates $\trcov{c}_i$ that fluctuate around $\trcov{c}_{\rm pop}$.
ii) The random component $\eta_i$ describes the remaining variability, i.e., variability between subjects that have the same covariate values.
By definition, a mixed-effects model combines these two components: fixed and random effects. In linear covariate models, these two effects combine additively.
Here, the vector of population parameters is $\theta = (\psi_{\rm pop},\bbeta,\omega^2)$. We can then use (5) of Gaussian models to give the pdf of $\psi_i$:
\(
\ppsii(\psi_i;\trcov{c}_i ,\theta)= \displaystyle{ \frac{h^\prime(\psi_i)}{\sqrt{2 \pi \omega^2} } }\exp \left\{-\frac{1}{2 \, \omega^2} (h(\psi_i) - h(\psi_{\rm pop}) - \langle \bbeta , \trcov{c}_i - \trcov{c}_{\rm pop} \rangle)^2 \right\} ,
\)

(5)

and the likelihood function:
\( {\like}(\theta ; \psi_1,\psi_2,\ldots,\psi_N) \ \ \eqdef \ \ \prod_{i=1}^{N}\ppsii(\psi_i;\trcov{c}_i ,\theta). \)
The Maximum Likelihood Estimate (MLE) of $\theta$ has a closed form here since the model is linear. Let
\(\begin{eqnarray}
\xi &=& \left(
\begin{array}{c}
h(\psi_{\rm pop}) \\
\beta_1 \\
\vdots \\
\beta_L \\
\end{array}
\right)
, \quad
h(\bpsi) = \left(
\begin{array}{c}
h(\psi_1) \\
h(\psi_2) \\
\vdots \\
h(\psi_N) \\
\end{array}
\right)
, \quad
C = \left(
\begin{array}{cccc}
1 & \trcov{c}_{1,1} - \trcov{c}_{\rm pop,1} & \ldots & \trcov{c}_{1,L} - \trcov{c}_{\rm pop,L} \\
1 & \trcov{c}_{2,1} - \trcov{c}_{\rm pop,1} & \ldots & \trcov{c}_{2,L} - \trcov{c}_{\rm pop,L} \\
\vdots & \vdots & \ddots & \vdots \\
1 & \trcov{c}_{N,1} - \trcov{c}_{\rm pop,1} & \ldots & \trcov{c}_{N,L} - \trcov{c}_{\rm pop,L} \\
\end{array}
\right).
\end{eqnarray}\)
Then, $\hat{\xi} \ = \ (C^\prime \, C)^{-1} \, C^\prime \, h(\bpsi)$, and
\(\begin{eqnarray}
\hat{\omega}^2 &=& \frac{1}{N} \| h(\bpsi) - C \hat{\xi} \|^2 \\
&=& \frac{1}{N} \sum_{i=1}^{N} \left(h(\psi_i) - h(\hpsi_{\rm pop}) - \sum_{\ell=1}^{L} \hat{\beta}_\ell (\trcov{c}_{i,\ell} - \trcov{c}_{\rm pop,\ell}) \right)^2 .
\end{eqnarray}\)
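As a rough sketch of this closed-form estimate, the Python code below builds the design matrix $C$ from centered covariates, solves the normal equations $C^\prime C \, \xi = C^\prime h(\bpsi)$, and computes $\hat{\omega}^2$ from the residuals. The function names and the example values are hypothetical; pure Python is used only to keep the sketch self-contained, and $C^\prime C$ is assumed to be invertible.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_covariate_model(h_psi, covariates, c_pop):
    """Return (xi_hat, omega2_hat); xi_hat = [h(psi_pop), beta_1, ..., beta_L]."""
    n = len(h_psi)
    # Design matrix C: a column of ones, then centered covariates c_il - c_pop,l.
    design = [[1.0] + [c - ref for c, ref in zip(row, c_pop)] for row in covariates]
    p = len(design[0])
    ctc = [[sum(design[i][a] * design[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    cth = [sum(design[i][a] * h_psi[i] for i in range(n)) for a in range(p)]
    xi_hat = solve_linear_system(ctc, cth)
    residuals = [h_psi[i] - sum(design[i][a] * xi_hat[a] for a in range(p))
                 for i in range(n)]
    omega2_hat = sum(r * r for r in residuals) / n
    return xi_hat, omega2_hat


# Hypothetical noise-free data: log(V_i) = log(10) + 0.8 * log(w_i / 70).
weights = [50.0, 60.0, 70.0, 80.0, 90.0]
log_v = [math.log(10.0) + 0.8 * math.log(w / 70.0) for w in weights]
xi_hat, omega2_hat = fit_linear_covariate_model(
    log_v, [[math.log(w)] for w in weights], [math.log(70.0)]
)
```

Because the example data contain no random effect, the fit recovers $h(\psi_{\rm pop}) = \log(10)$ and $\beta = 0.8$ up to rounding error, and $\hat{\omega}^2$ is numerically zero.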
Remarks
1. Let $ d_{i,\ell} = \trcov{c}_{i,\ell} - \trcov{c}_{\rm pop,\ell}$ and $\teta_i = \omega^{-1}\eta_i$. Then
(4) can be written

\(
h(\psi_i) = h(\psi_{\rm pop}) + \beta_1 d_{i,1} + \beta_2 d_{i,2} + \ldots
+ \beta_L d_{i,L} + \omega \teta_i , \quad \teta_i \sim {\cal N}(0,1).
\)

(6)

Here, $d_{i,1}$, $d_{i,2}$, ..., $d_{i,L}$ and $\teta_i$ represent the effects that contribute to the fluctuations of $h(\psi_i)$ around $h(\psi_{\rm pop})$. Coefficients $\beta_1$, $\beta_2$, $\ldots$, $\beta_L$ and $\omega$ represent the magnitude of these effects. If the $\ell$th coefficient is zero, this means that the $\ell$th covariate has no effect. Similarly, $\omega =0$ signifies that there is no random effect.
2. The $d_{i,\ell}$ and the random effect $\teta_i$ play similar roles. The difference is essentially that the $d_{i,\ell}$ are "known" in the modeling context, unlike $\teta_i$.
If the context is simulation, all of them are random variables with their own specified distributions. We can therefore consider a random effect like a covariate that is not observed.
Example 1:
In this example, the individual parameter $\psi_i$ is the
volume of distribution $V_i$, which we could assume to be log-normally distributed. The weight $w_i$ (kg) can be used to explain part of the variability of the volume between individuals:
\(
\log(V_i) = \log (V_{\rm pop}) + \beta (\log(w_i) - \log(70)) + \eta_{i},
\)

(7)

where $\eta_{i} \sim {\cal N}(0, \omega_V^2)$.
Here, the covariate used in the statistical model is the log-weight and the reference weight that we decide to choose is $70$kg.
Of course, it would be absolutely equivalent to define the covariate as $c_i=\log(w_i/70)$. Then, the reference value of this covariate would become $c_{\rm pop}=0$ for an individual of 70kg, and model (7) can instead be written
\( \log(V_i) = \log (V_{\rm pop}) + \beta \, \log(w_i/70) + \eta_{i}. \)
The same model can be expressed in different ways. For instance, taking the exponential gives a model in terms of $V_i$:
\( V_i = \Vpop \left(\displaystyle{ \frac{w_i}{70} }\right)^{\beta} \, e^{\eta_{i} }. \)
Here, the predicted volume for an individual with weight $w_i$ is
\( \pred{V}_i = \Vpop \left(\displaystyle{ \frac{w_i}{70} }\right)^{\beta}. \)
The right-hand side panel of the figure shows how the predicted volume $\pred{V}$ increases with weight $w$ for different values of $\beta$. Here, $\Vpop$ has been set at 10. For $\beta$ not equal to 0 or 1, the model is not linear. However, the predicted $\log$-volume (left-hand side panel) does increase linearly with the $\log$-weight:
\( \log(\pred{V}_i) = \log(\Vpop) + \beta \, \log(w_i/70). \)
Of course this model is not unique: there exist several possible transformations of the weight that ensure that the predicted volume increases with weight. Setting for example $c_i=w_i-70$ assumes that the predicted log-volume increases linearly with the weight. These two covariate models give very similar predictions for $\beta$ close to 1 (which is a typical value for PK applications).
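The power model of Example 1 is easy to evaluate directly. The sketch below, in Python, computes the predicted volume for a given weight and checks that the predicted log-volume is linear in the log-weight with slope $\beta$. The function name is hypothetical; $\Vpop = 10$ follows the value used for the figure, and the weights and $\beta$ are arbitrary.

```python
import math


def predicted_volume(weight_kg, v_pop, beta, w_ref=70.0):
    """Predicted volume V_pop * (w / w_ref) ** beta for the power model."""
    return v_pop * (weight_kg / w_ref) ** beta


v_pop = 10.0

# A 70 kg individual is the reference: the prediction is V_pop for any beta.
v_ref = predicted_volume(70.0, v_pop, beta=0.75)

# With beta = 1 the predicted volume is proportional to weight.
v_double = predicted_volume(140.0, v_pop, beta=1.0)

# Slope of log-volume against log-weight between two arbitrary weights.
beta = 0.75
w_a, w_b = 55.0, 95.0
slope = (
    math.log(predicted_volume(w_b, v_pop, beta))
    - math.log(predicted_volume(w_a, v_pop, beta))
) / (math.log(w_b) - math.log(w_a))
```

The computed slope equals $\beta$ whatever pair of weights is chosen, which is the linearity on the log-log scale described above.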
Example 2:
In this second example, we suppose that the bioavailability $F_i$ has a logit-normal distribution, and age $a_i$ (years) is used as a covariate with a reference age of 40 years:
\(\begin{eqnarray}
\logit(F_i) &=& \logit (\pred{F}_i) + \eta_{F,i} \\
&=& \logit (F_{\rm pop}) + \beta (a_i-40) + \eta_{F,i} ,
\end{eqnarray}\)
where $\eta_{F,i} \sim {\cal N}(0, \omega_F^2)$. The predicted logit-bioavailability for an individual of age $a_i$ is then
\(\logit(\pred{F}_i) = \logit (F_{\rm pop}) + \beta (a_i-40). \)
We can derive from this equation an expression for $\pred{F}_i$:
\( \pred{F}_i = \displaystyle{ \frac{F_{\rm pop} }{F_{\rm pop} + (1- F_{\rm pop})e^{-\beta (a_i-40)} } }. \)
We see in this example how it is much easier to define a model for the transformed parameter $\logit({F}_i)$ than for $F_i$ itself. Furthermore, as the logit transform is strictly increasing, both vary in the same direction with respect to changes in $a_i$.
This figure shows how $\pred{F}_i$ and $\logit(\pred{F}_i)$ vary with age for several values of $\beta$.
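The prediction of Example 2 can be computed either by applying the inverse logit to the linear predictor or from the closed-form expression obtained by solving for $\pred{F}_i$. The Python sketch below does both and lets us check that they agree; the function names and the values of $F_{\rm pop}$ and $\beta$ are hypothetical.

```python
import math


def logit(p):
    """Logit transform, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of the logit transform."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_bioavailability(age, f_pop, beta, age_ref=40.0):
    """Back-transform the linear predictor logit(F_pop) + beta * (age - age_ref)."""
    return inv_logit(logit(f_pop) + beta * (age - age_ref))


def predicted_bioavailability_closed_form(age, f_pop, beta, age_ref=40.0):
    """Same prediction, written directly in terms of F_pop."""
    return f_pop / (f_pop + (1.0 - f_pop) * math.exp(-beta * (age - age_ref)))


# Hypothetical values, chosen only for illustration.
f_pop, beta = 0.6, 0.03
ages = [20.0, 40.0, 60.0, 80.0]
predictions = [predicted_bioavailability(a, f_pop, beta) for a in ages]
```

Whatever the age, the prediction stays strictly between 0 and 1, which is the practical reason for modeling $\logit(F_i)$ rather than $F_i$.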
Nonlinear models for continuous variables
Nonlinear models allow for much more general relationships between the covariate vector $\trcov{c}_i$ and the prediction $\hpsi_i$.
For equation (1) we now only assume that there exists some function $m$ and reference value $\psi_{\rm pop}$ such that
\(\begin{eqnarray}
\hpsi_i &=& \hmodel(\bbeta,\trcov{c}_i) \\
\psi_{\rm pop} &=& \hmodel(\bbeta,\trcov{c}_{\rm pop}).
\end{eqnarray}\)
We can either make the hypothesis that we are still in the Gaussian case, or not.
 If we hypothesize that we are still working with Gaussian models, then extending the linear model in (4) is straightforward: we suppose that there exists a monotone transformation $h$ such that
\(\begin{eqnarray}
h(\psi_i) &=& h(\hpsi_i)+ \eta_i \\
&=& \mmodel(\bbeta,\trcov{c}_i)+ \eta_i ,
\end{eqnarray}\)

(8)

 where $\mmodel(\bbeta,\trcov{c}_i)=h(\hmodel(\bbeta,\trcov{c}_i))$ is the prediction of $h(\psi_i)$. We can then derive the pdf of $\psi_i$ using (5) of Gaussian models as before:
\(
\ppsii(\psi_i;\trcov{c}_i , \theta )=\displaystyle{ \frac{h^\prime(\psi_i)}{\sqrt{2 \pi \omega^2} } } \exp\left\{-\displaystyle{ \frac{1}{2 \, \omega^2} } (h(\psi_i) - \mmodel(\bbeta,\trcov{c}_i))^2 \right\},
\)

(9)

 where $\theta=(\bbeta,\omega^2)$. The only difference with the Gaussian linear model is that now there is no explicit form available for the MLE of $\theta$. Instead, it is defined as the solution of an optimization problem:
\(\begin{eqnarray}
\hat{\bbeta} &= &\argmin{\bbeta} \left\{ \sum_{i=1}^{N} \left( h(\psi_i) - \mmodel(\bbeta,\trcov{c}_i) \right)^2 \right\} \\
\hat{\omega}^2 &=& \frac{1}{N} \sum_{i=1}^{N} \left( h(\psi_i) - \mmodel(\hat{\bbeta},\trcov{c}_i) \right)^2 .
\end{eqnarray}\)
Example 3:
Consider the following model for $\psi_i$:
\( \psi_i = \displaystyle{ \frac{\beta_1\, e^{\eta_i} }{1 + \beta_2 \, \trcov{c}_i} }, \)
where $\eta_i \sim {\cal N}(0, \omega^2)$ is assumed. We are going to suppose that the $\log$ of $\psi_i$:
\( \log(\psi_i) \ \ = \ \ \log\left( \displaystyle{ \frac{\beta_1}{1 + \beta_2 \, \trcov{c}_i} } \right) + \eta_i \, , \)
is Gaussian. Here, $h$ is the $\log$ function, $\hpsi_i = \beta_1/(1 + \beta_2 \, \trcov{c}_i)$ and $\psi_{\rm pop}=\beta_1/(1 + \beta_2 \, \trcov{c}_{\rm pop})$.
Therefore,
\( \begin{eqnarray}
\log(\psi_i) &\sim& {\cal N}\left( \log\left( \displaystyle{ \frac{\beta_1}{1 + \beta_2 \, \trcov{c}_i} } \right) , \omega^2\right) ,
\end{eqnarray}\)
and the optimization problem to solve for the MLE is:
\( \begin{eqnarray}
(\hat{\beta}_1,\hat{\beta}_2)& =& \argmin{\beta_1,\beta_2} \sum_{i=1}^{N} \left(\log(\psi_i) - \log\left( \displaystyle{ \frac{\beta_1}{1 + \beta_2 \, \trcov{c}_i} }\right) \right)^2 \\
\hat{\omega}^2 &=& \frac{1}{N} \sum_{i=1}^{N} \left(\log(\psi_i) - \log\left( \displaystyle{ \frac{\hat{\beta}_1}{1 + \hat{\beta}_2 \, \trcov{c}_i} } \right) \right)^2.
\end{eqnarray}\)
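One possible way of solving this least-squares problem numerically is sketched below in Python. It is not the method of any particular software: it exploits the fact that, for fixed $\beta_2$, the optimal $\log(\beta_1)$ is a simple mean, and then minimizes the resulting one-dimensional criterion in $\beta_2$ by golden-section search. The function names, the search interval and the data are hypothetical, and the sketch assumes that $1 + \beta_2 \, \trcov{c}_i > 0$ and that the profiled criterion has a single minimum on the interval.

```python
import math


def profile_fit(psi, c, beta2):
    """For a fixed beta2, return (beta1_hat, sum of squared residuals)."""
    # With beta2 fixed, log(beta1) enters additively, so its least-squares
    # value is the mean of log(psi_i) + log(1 + beta2 * c_i).
    shifted = [math.log(p) + math.log(1.0 + beta2 * ci) for p, ci in zip(psi, c)]
    log_beta1 = sum(shifted) / len(shifted)
    ssr = sum((s - log_beta1) ** 2 for s in shifted)
    return math.exp(log_beta1), ssr


def fit_example3(psi, c, beta2_low=0.0, beta2_high=10.0, tol=1e-8):
    """Minimize the profiled criterion over beta2 by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = beta2_low, beta2_high
    while b - a > tol:
        x1 = b - inv_phi * (b - a)
        x2 = a + inv_phi * (b - a)
        if profile_fit(psi, c, x1)[1] < profile_fit(psi, c, x2)[1]:
            b = x2  # the minimum lies in [a, x2]
        else:
            a = x1  # the minimum lies in [x1, b]
    beta2_hat = 0.5 * (a + b)
    beta1_hat, ssr = profile_fit(psi, c, beta2_hat)
    omega2_hat = ssr / len(psi)
    return beta1_hat, beta2_hat, omega2_hat


# Hypothetical noise-free data generated with beta1 = 2 and beta2 = 0.5.
c_values = [0.5, 1.0, 2.0, 4.0, 8.0]
psi_values = [2.0 / (1.0 + 0.5 * ci) for ci in c_values]
beta1_hat, beta2_hat, omega2_hat = fit_example3(psi_values, c_values)
```

With real data a general-purpose nonlinear least-squares routine would normally be preferred; the profiling trick is used here only to keep the example short and dependency-free.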
 For more general distributions of $\psi_i$, we can simply define $\psi_i$ as a function of fixed and random effects:
\(
\psi_i = \model(\bbeta,\trcov{c}_i,\eta_i).
\)

(10)

 The prediction $\hpsi_i$ is obtained when setting $\eta_i \equiv 0$:
\(
\hpsi_i = \model(\bbeta,\trcov{c}_i,\eta_i\equiv 0),
\)
 and the population value of $\psi$ when $c_i = c_{\rm pop}$:
\(
\psi_{\rm pop} = \model(\bbeta,\trcov{c}_i \equiv \trcov{c}_{\rm pop},\eta_i\equiv 0).
\)
 If the random effects are supposed Gaussian, there always exists an underlying Gaussian model which describes the distribution of $\psi_i$. Let $\imodel$ be the function obtained by rearranging (10) as a function of $\eta_i$:
\(\eta_i = \imodel(\bbeta,\trcov{c}_i,\psi_i). \)
 We can then derive the pdf of $\psi_i$ from that of $\eta_i$,
\(
\ppsii(\psi_i;\trcov{c}_i, \theta )=\displaystyle{ \frac{ \partial}{\partial\psi} }\imodel(\bbeta,\trcov{c}_i,\psi_i) \displaystyle{ \frac{1}{\sqrt{2 \pi \omega^2} } }\exp\left\{-\displaystyle{ \frac{\imodel^2(\bbeta,\trcov{c}_i,\psi_i)}{2 \, \omega^2} } \right\},
\)

(11)

 where $\theta=(\bbeta,\omega^2)$ is the vector of population parameters of the model. We can then state the likelihood function:
\( {\like}(\theta ; \psi_1,\psi_2,\ldots,\psi_N) \ \eqdef \ \prod_{i=1}^{N}\ppsii(\psi_i;\trcov{c}_i, \theta ).
\)
 The distribution $\qpsii$ and the likelihood ${\like}$ have closed forms if and only if the inverse function $\imodel$ can be computed in closed form, which is not always the case.
Example 4:
Suppose that we model with
\( \psi_i = \displaystyle {\frac{\beta_1 \, e^{\eta_i} }{1 + \beta_2 \, \trcov{c}_i \, e^{\eta_i} } },
\)
where $\eta_i \sim {\cal N}(0, \omega^2)$ is assumed.
As before, $\hpsi_i$ is obtained when $\eta_i$ is set to 0: $\hpsi_i = \beta_1/(1 + \beta_2 \, c_i)$, and $\psi_{\rm pop}=\beta_1/(1 + \beta_2 \, \trcov{c}_{\rm pop})$.
In this example, it is possible to rearrange the formula for $\eta_i$:
\(
\eta_i \ \ = \log \left( \displaystyle{ \frac{\psi_i}{\beta_1 - \beta_2 \, \trcov{c}_i \, \psi_i} } \right) \ \ \sim \ \ {\cal N}( 0 , \omega^2).
\)
It is therefore possible to explicitly give the distribution of $\psi_i$ and the likelihood ${\like}$ using
(11) with $\imodel(\bbeta,\trcov{c}_i,\psi_i)= \log \left( \psi_i/(\beta_1 - \beta_2 \, \trcov{c}_i \, \psi_i) \right)$.
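The change of variables used in Example 4 can be sketched in Python as follows. The code maps $\eta_i$ to $\psi_i$ and back, and evaluates the pdf of $\psi_i$ as the Gaussian density of $\eta_i$ multiplied by the derivative $\partial \eta_i / \partial \psi_i = \beta_1 / (\psi_i (\beta_1 - \beta_2 \, \trcov{c}_i \, \psi_i))$. The function names and parameter values are hypothetical, and the sketch assumes $\beta_1, \beta_2, \trcov{c}_i > 0$, so that $\psi_i$ lies in $(0, \beta_1/(\beta_2 \, \trcov{c}_i))$.

```python
import math


def psi_from_eta(eta, beta1, beta2, c):
    """Model of Example 4: psi as a function of the random effect eta."""
    e = math.exp(eta)
    return beta1 * e / (1.0 + beta2 * c * e)


def eta_from_psi(psi, beta1, beta2, c):
    """Inverse map: the random effect recovered from psi."""
    return math.log(psi / (beta1 - beta2 * c * psi))


def pdf_psi(psi, beta1, beta2, c, omega):
    """Density of psi obtained from the N(0, omega^2) density of eta."""
    eta = eta_from_psi(psi, beta1, beta2, c)
    # Jacobian d(eta)/d(psi) = beta1 / (psi * (beta1 - beta2 * c * psi)).
    jacobian = beta1 / (psi * (beta1 - beta2 * c * psi))
    gauss = math.exp(-eta * eta / (2.0 * omega ** 2)) / math.sqrt(2.0 * math.pi * omega ** 2)
    return jacobian * gauss


def log_likelihood(psis, cs, beta1, beta2, omega):
    """Log-likelihood of (beta1, beta2, omega) given psi_1, ..., psi_N."""
    return sum(math.log(pdf_psi(p, beta1, beta2, ci, omega)) for p, ci in zip(psis, cs))


# Hypothetical parameter values.
beta1, beta2, c, omega = 2.0, 0.5, 1.0, 0.3
psi = psi_from_eta(0.2, beta1, beta2, c)
eta_back = eta_from_psi(psi, beta1, beta2, c)  # recovers 0.2 up to rounding
```

As a sanity check, integrating `pdf_psi` numerically over the admissible range of $\psi_i$ should give a value close to 1.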
Example 5:
Let us now propose a model that has a small modification with respect to the previous one,
\( \psi_i = \displaystyle{ \frac{\beta_1 + \eta_i}{1 + \beta_2 \, \trcov{c}_i \, e^{\eta_i} } }.
\)
The predictions $\hpsi_i$ and $\psi_{\rm pop}$ can both be described as before, but it is no longer possible to explicitly invert the formula in order to express $\eta_i$ as a function of $\psi_i$. Therefore we cannot explicitly write the likelihood ${\like}$ in this case.
Remarks
Even though the great flexibility of such models appears attractive at first glance, we must remain attentive to what we want to use them for and the tasks we want to perform. In a modeling context, remember that the individual parameters are not observed. The choice of using a complex model for such variables can pose several problems, for model identification and parameter estimation. Even though software like NONMEM allows us to define nonlinear models of covariates, it is not realistic to think that it can correctly estimate them, since the FO and FOCE algorithms implemented in NONMEM are based on linearization of the model. In this way, a user can precisely define a model, but has little control over the quality of the output.
The linear model proposed in (3) has certain limits due to the fact that it cannot represent all possible and imaginable models, but it remains sufficiently flexible (due to being able to choose a parameter transform $h$ and covariate transforms) and robust (see Tasks & Tools Section) to be successfully used in most situations.
$\mlxtran$ allows us to write any linear or nonlinear covariate model. Such a model can then easily be used for simulation (using the R/Matlab function
simulmlx for instance). On the other hand, only linear covariate models can be used for estimation with $\monolix$.
A model for categorical covariates
Categorical variables take a finite number of values from some set that is not necessarily numerical or even ordered, e.g., gender, country and ethnicity.
The approach taken for continuous covariates extends easily to categorical ones.
For simplicity's sake, let us consider a unique covariate $\trcov{c}_i$ that takes its values in $\{ a_1, a_2, \ldots, a_K\}$, and a unique parameter $\psi_i$. A reference covariate value $\trcov{c}_{\rm pop}$ here is a reference category, i.e., a specific element $a_{\kref}$ of $\{ a_1, a_2, \ldots, a_K\}$. The prediction of $\psi_i$ is thus given by the following model:
\(
h(\hpsi_i) = h(\psi_{\rm pop})+ \beta_1 \one_{\trcov{c}_i=a_1} + \beta_2 \one_{\trcov{c}_i=a_2} + \ldots + \beta_K \one_{\trcov{c}_i=a_K},
\)

(12)

with $\beta_{\kref} = 0$. Then, (12) is equivalent to
\(
h(\hpsi_i) = \left\{
\begin{array}{ll}
h(\psi_{\rm pop}) & {\rm if \quad} \trcov{c}_i=a_{\kref} \\
h(\psi_{\rm pop}) + \beta_k & {\rm if \quad } \trcov{c}_i=a_k \neq a_{\kref}
\end{array}
\right.
\)
We see that if the covariate has $K$ categories, then $K-1$ coefficients $(\beta_k)$ are required for defining the covariate model.
Example 6:
Assume that the individual clearance (of a drug) depends on gender. Here, the gender $g_i$ of individual $i$ can either be female or male. We arbitrarily choose female as reference gender. Assuming a lognormal distribution for clearance, the model can be written as follows:
\( \log(Cl_i) = \log(Cl_{\rm pop}) + \beta \one_{g_i={\rm male} } + \eta_i , \)
and the predicted clearance is
\(
\pred{Cl}_i = \left\{
\begin{array}{ll}
Cl_{\rm pop} & {\rm if \quad g_i= female} \\
Cl_{\rm pop} \, e^\beta & {\rm if \quad g_i= male}.
\end{array}
\right.
\)
Example 7:
We want to model the variation in weight between individuals of three countries: India, US and China.
Assuming a normal distribution for weight and India as the reference country, we have:
\( w_i = w_{\rm pop} + \beta_1 \one_{o_i={\rm US} } + \beta_2 \one_{o_i={\rm China} } + \eta_i. \)
The predicted weight is therefore
\(
\pred{w}_i = \left\{
\begin{array}{ll}
w_{\rm pop} & {\rm if \quad o_i= India} \\
w_{\rm pop} + \beta_1 & {\rm if \quad o_i= US} \\
w_{\rm pop} + \beta_2 & {\rm if \quad o_i= China}.
\end{array}
\right. \)
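The indicator coding used in Examples 6 and 7 can be sketched in Python with a dictionary that holds one coefficient per non-reference category. The function name and all numerical values below are hypothetical and serve only to illustrate the structure of the model.

```python
import math


def predict_transformed(h_pop, coefficients, category, reference):
    """Return h(psi_hat) for one categorical covariate with indicator coding.

    `coefficients` maps each non-reference category to its beta_k; the
    reference category has an implicit coefficient of zero.
    """
    if category == reference:
        return h_pop
    return h_pop + coefficients[category]


# Example 7: normal distribution for weight (h is the identity), India as reference.
w_pop = 60.0
country_betas = {"US": 20.0, "China": 5.0}  # K = 3 categories, K - 1 = 2 coefficients
w_india = predict_transformed(w_pop, country_betas, "India", reference="India")
w_us = predict_transformed(w_pop, country_betas, "US", reference="India")

# Example 6: lognormal clearance (h is the log), female as reference.
cl_pop, beta_male = 5.0, 0.2
log_cl_male = predict_transformed(math.log(cl_pop), {"male": beta_male}, "male", reference="female")
cl_male = math.exp(log_cl_male)  # equals Cl_pop * exp(beta)
```

Changing the reference category changes the meaning of $\psi_{\rm pop}$ and of the coefficients, but not the set of predictions that the model can produce.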
$\mlxtran$ for covariate models
Example 1 (linear model):
Two covariates:
 weight $w_i$: continuous covariate,
 gender $g_i$: categorical covariate, $g_i\in\{ {\rm F, M}\}$.
Vector of individual parameters: $\psi_i = (ka_i, V_i, Cl_i)$,
\(\begin{eqnarray}
\log(ka_i) &\sim& {\cal N}(\log(ka_{\rm pop}), \omega_{ka}^2) \\
\log(V_i) &\sim& {\cal N}(\log(V_{\rm pop}) + \beta_{V,w}\log(w_i/70), \omega_V^2) \\
\log(Cl_i) &\sim& {\cal N}(\log(Cl_{\rm pop}) + \beta_{Cl,w}\log(w_i/70) + \\
& & \beta_{Cl,g}\one_{g_i=M} , \omega_{Cl}^2) \\
\end{eqnarray}\)

MLXTran
[INDIVIDUAL]
input={ka_pop, V_pop, Cl_pop, beta_V, beta1_Cl, beta2_Cl,
omega_ka, omega_V, omega_Cl, weight, gender}
EQUATION:
lw70=log(weight/70)
DEFINITION:
ka = {distribution=lognormal,reference=ka_pop,sd=omega_ka}
V = {distribution=lognormal,reference=V_pop,
covariate=lw70,coefficient=beta_V,sd=omega_V}
Cl = {distribution=lognormal,reference=Cl_pop,
covariate={lw70,gender},
coefficient={beta1_Cl,beta2_Cl},sd=omega_Cl}

(It is assumed here that gender has been previously defined as a categorical covariate with two categories {F, M} and F as reference category).
Example 2 (nonlinear model):
\( \begin{eqnarray}
\log(\psi_i) &\sim& {\cal N}\left( \log\left( \displaystyle{ \frac{\beta_1}{1 + \beta_2 \, \trcov{c}_i} }\right) , \omega^2\right)
\end{eqnarray}\)

MLXTran
[INDIVIDUAL]
input={beta1, beta2, omega, c}
EQUATION:
predpsi = beta1/(1+beta2*c)
DEFINITION:
psi = {distribution=lognormal,prediction=predpsi,sd=omega}
