# Model specification, log-likelihood, scores and second derivatives

## Notation

$$\boldsymbol{\beta}$$ — vertical coefficient vector.
$$\boldsymbol{X}$$ — Covariate matrix with one row per observation.
$$\boldsymbol{X_i}$$ — i’th row from $$\boldsymbol{X}$$
$$\boldsymbol{Y}$$ — Vertical binary outcome vector.
$$k$$ — number of covariates.
$$n$$ — number of observations.
$$i$$ — observation index.
$$j$$ — covariate index.

\begin{align} \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix} \quad \boldsymbol{X} = \begin{bmatrix} 1 & X_{1, 1} & X_{2, 1} & \ldots & X_{k, 1} \\ 1 & X_{1, 2} & X_{2, 2} & \ldots & X_{k, 2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & X_{1, n} & X_{2, n} & \ldots & X_{k, n} \\ \end{bmatrix} = \begin{bmatrix} \boldsymbol{X_1} \\ \boldsymbol{X_2} \\ \vdots \\ \boldsymbol{X_n} \end{bmatrix} \quad \boldsymbol{Y} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_3 \end{bmatrix} \end{align}

## Model

$P(Y_i = 1) = \frac{\lambda}{1 + \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta})} = \frac{\text{exp}(\theta)}{(1 + \text{exp}(\theta))(1 + \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}))}$

$$\theta$$ is the logit transformation of $$\lambda$$: $$\theta = \text{log}(\frac{\lambda}{1-\lambda})$$

Optimisation is done using the $$\theta$$ parameterisation because it does not constrain the likelihood.

## Log likelihood

$l(\theta, \boldsymbol{\beta}) = \sum_i \ y_i \ \theta - \text{log} \big( 1+\text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) \big) - \text{log} \big( 1+\text{exp}(\theta) \big) + (1-y_i)\text{log} \Big( 1 + \text{exp} \big( \boldsymbol{X_i}\boldsymbol{\beta} \big) \big( 1 + \text{exp}(\theta) \big) \Big)$

## Scores

\begin{align} \begin{bmatrix} \frac{dl}{d\lambda} \\ \frac{dl}{d\beta_j} \end{bmatrix} = \begin{bmatrix} \sum_i y_i - \frac{\text{exp}(\theta)}{1 + \text{exp}(\theta)} + \frac{ (1 - y_i) \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) \text{exp}(\theta) }{ 1 + \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) \big( 1 + \text{exp}(\theta) \big) }\\ \sum_i x_{j, i} \Big( -\frac{ \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) }{ 1 + \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) } + \frac{ (1 - y_i) \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) \big( 1 + \text{exp}(\theta) \big) }{ 1 + \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) \big( 1 + \text{exp}(\theta) \big) } \Big) \\ \end{bmatrix} \end{align}

## Second derivatives

\begin{align} \begin{array}{cc} \begin{matrix} \frac{dl}{d\lambda} \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad & \frac{dl}{d\beta_j} \end{matrix}\\ \begin{matrix} \frac{dl}{d\lambda} \\ \frac{dl}{d\beta_j} \\ \end{matrix} \begin{bmatrix} \sum_i - \frac{ \text{exp}(\theta) }{ \big( 1+\text{exp}(\theta) \big)^2 } + \frac{ (1-y_i)(1+\text{exp}(\boldsymbol{X_i}\boldsymbol{\beta})) \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) \text{exp}(\theta) }{ \Big( 1 + \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) \big( 1 + \text{exp}(\theta) \big) \Big)^2 } & \sum_i x_{j,i} \Big( \frac{ (1-y_i) \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) \text{exp}(\theta) }{ \Big( 1 + \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) \big( 1 + \text{exp}(\theta) \big) \Big)^2 } \Big) \\ . & \sum_i x_{j,i} \Big( - \frac{ \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) }{ \big( 1 + \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) \big)^2 } + \frac{ (1-y_i)(1+\text{exp}(\theta))\text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) }{ \Big( 1 + \text{exp}(\boldsymbol{X_i}\boldsymbol{\beta}) \big( 1 + \text{exp}(\theta) \big) \Big)^2 } \Big) \\ \end{bmatrix} \end{array} \end{align}

## References

Dunning AJ (2006). “A model for immunological correlates of protection.” Statistics in Medicine, 25(9), 1485-1497. https://doi.org/10.1002/sim.2282.