Performs adjustments of a user-supplied independence loglikelihood for the presence of cluster dependence, following Chandler and Bate (2007). The user provides a function that returns a vector of observation-specific loglikelihood contributions and a vector that indicates cluster membership. The loglikelihood of a sub-model can be adjusted by fixing a set of parameters at particular values.
adjust_loglik( loglik = NULL, ..., cluster = NULL, p = NULL, init = NULL, par_names = NULL, fixed_pars = NULL, fixed_at = 0, name = NULL, larger = NULL, alg_deriv = NULL, alg_hess = NULL, mle = NULL, H = NULL, V = NULL )
loglik | A named function. Returns a vector of the
loglikelihood contributions of individual observations. The first
argument must be the vector of model parameter(s). If any of the model
parameters are out-of-bounds then |
---|---|
... | Further arguments to be passed either to |
cluster | A vector or factor indicating from which cluster the
respective loglikelihood contributions from |
p | A numeric scalar. The dimension of the full parameter
vector, i.e. the number of parameters in the full model. Must be
consistent with the lengths of |
init | A numeric vector of initial values. Must have length equal
to the number of parameters in the full model. If |
par_names | A character vector. Names of the |
fixed_pars | A vector specifying which parameters are to be restricted
to be equal to the value(s) in |
fixed_at | A numeric vector of length 1 or |
name | A character scalar. A name for the model that gives rise
to |
larger | Only relevant if An object of class |
alg_deriv | A function with the vector of model parameter(s) as its
first argument. Returns a |
alg_hess | A function with the vector of model parameter(s) as its
first argument. Returns a Supplying both |
mle | A numeric vector. Can only be used if |
H, V | p by p numeric matrices. Only used if Supplying both |
A function of class "chandwich"
to evaluate an adjusted
loglikelihood, or the independence loglikelihood, at one or more sets
of model parameters, with arguments
A numeric vector or matrix giving values of the p_current
(see below) parameters in the model to which the returned adjusted
loglikelihood applies.
If p_current = 1
this may be a numeric vector or a matrix
with 1 column.
If p_current > 1
this may be a numeric vector of length p
(one set of model parameters) or a numeric matrix with p
columns (nrow(x)
sets of model parameters), one set in each row
of x
.
A character scalar. The type of adjustment to use.
One of "vertical"
, "cholesky"
, "spectral"
or
"none"
.
The number of parameters in the full model and current models, respectively.
A numeric vector giving the indices of the free
parameters in the current model, with names inferred from
par_names
if this was supplied.
Numeric vectors, with names inferred from
par_names
if this was supplied. Maximum likelihood estimates
of free parameters under the current model (mle
) and all
parameters in the full model, including any parameters with fixed
values (res_MLE
).
The unadjusted and adjusted estimated standard errors, of the free parameters, respectively.
The unadjusted and adjusted estimated variance-covariance matrix of the free parameters, respectively.
The Hessians of the independence and adjusted loglikelihood, respectively.
The matrix C in equation (14) of Chandler and Bate (2007), calculated using Cholesky decomposition and spectral decomposition, respectively.
The names of the parameters in the full and current models, respectively, if these were supplied in this call or a previous call.
The common maximised value of the independence and adjusted loglikelihoods.
The arguments loglik
and cluster
supplied in this call, or a previous call.
A list containing the further arguments passed to
loglik
via ... in this call, or a previous call.
a vector containing the contributions of individual observations to the independence log-likelihood evaluated at the MLE.
The argument name
, or the name of the function
loglik
if name
isn't supplied.
The number of observations.
The call to adjust_loglik
.
The argument fixed_pars
, with names inferred from
par_names
if this was supplied.
The argument fixed_at
, with names inferred from
par_names
if this was supplied.
Three adjustments to the independence loglikelihood described in
Chandler and Bate (2007) are available. The vertical adjustment is
described in Section 6 and two horizontal adjustments are described
in Sections 3.2 to 3.4. See the descriptions of type
and, for the
horizontal adjustments, the descriptions of C_cholesky
and
C_spectral
, in Value.
The adjustments involve first and second derivatives of the loglikelihood
with respect to the model parameters. These are estimated using
jacobian
and optimHess
unless alg_deriv
and/or alg_hess
are supplied.
Chandler, R. E. and Bate, S. (2007). Inference for clustered data using the independence loglikelihood. Biometrika, 94(1), 167-183. doi: 10.1093/biomet/asm015
summary.chandwich
for maximum likelihood estimates
and unadjusted and adjusted standard errors.
plot.chandwich
for plots of one-dimensional adjusted
loglikelihoods.
confint.chandwich
, anova.chandwich
,
coef.chandwich
, vcov.chandwich
and logLik.chandwich
for other chandwich
methods.
conf_intervals
for confidence intervals for
individual parameters.
conf_region
for a confidence region for
a pair of parameters.
compare_models
to compare nested models using an
(adjusted) likelihood ratio test.
# ------------------------- Binomial model, rats data ---------------------- # Contributions to the independence loglikelihood binom_loglik <- function(prob, data) { if (prob < 0 || prob > 1) { return(-Inf) } return(dbinom(data[, "y"], data[, "n"], prob, log = TRUE)) } rat_res <- adjust_loglik(loglik = binom_loglik, data = rats, par_names = "p") # Plot the loglikelihoods plot(rat_res, type = 1:4, legend_pos = "bottom", lwd = 2, col = 1:4)#> MLE SE adj. SE #> p 0.1535 0.008645 0.01305# -------------------------- GEV model, owtemps data ----------------------- # ------------ following Section 5.2 of Chandler and Bate (2007) ----------- # Contributions to the independence loglikelihood gev_loglik <- function(pars, data) { o_pars <- pars[c(1, 3, 5)] + pars[c(2, 4, 6)] w_pars <- pars[c(1, 3, 5)] - pars[c(2, 4, 6)] if (isTRUE(o_pars[2] <= 0 | w_pars[2] <= 0)) return(-Inf) o_data <- data[, "Oxford"] w_data <- data[, "Worthing"] check <- 1 + o_pars[3] * (o_data - o_pars[1]) / o_pars[2] if (isTRUE(any(check <= 0))) return(-Inf) check <- 1 + w_pars[3] * (w_data - w_pars[1]) / w_pars[2] if (isTRUE(any(check <= 0))) return(-Inf) o_loglik <- log_gev(o_data, o_pars[1], o_pars[2], o_pars[3]) w_loglik <- log_gev(w_data, w_pars[1], w_pars[2], w_pars[3]) return(o_loglik + w_loglik) } # Initial estimates (method of moments for the Gumbel case) sigma <- as.numeric(sqrt(6 * diag(var(owtemps))) / pi) mu <- as.numeric(colMeans(owtemps) - 0.57722 * sigma) init <- c(mean(mu), -diff(mu) / 2, mean(sigma), -diff(sigma) / 2, 0, 0) # Loglikelihood adjustment for the full model par_names <- c("mu[0]", "mu[1]", "sigma[0]", "sigma[1]", "xi[0]", "xi[1]") large <- adjust_loglik(gev_loglik, data = owtemps, init = init, par_names = par_names) # Rows 1, 3 and 4 of Table 2 of Chandler and Bate (2007) t(summary(large))#> mu[0] mu[1] sigma[0] sigma[1] xi[0] xi[1] #> MLE 81.1700 2.6680 3.7290 0.5312 -0.19890 -0.08835 #> SE 0.3282 0.3282 0.2293 0.2293 0.04937 0.04937 #> adj. SE 0.4036 0.2128 0.2426 0.1911 0.03943 0.03625# Loglikelihood adjustment of some smaller models: xi[1] = 0 etc # Starting from a larger model medium <- adjust_loglik(larger = large, fixed_pars = "xi[1]") small <- adjust_loglik(larger = large, fixed_pars = c("sigma[1]", "xi[1]")) small <- adjust_loglik(larger = medium, fixed_pars = c("sigma[1]", "xi[1]")) # Starting from scratch medium <- adjust_loglik(gev_loglik, data = owtemps, init = init, par_names = par_names, fixed_pars = "xi[1]") small <- adjust_loglik(gev_loglik, data = owtemps, init = init, par_names = par_names, fixed_pars = c("sigma[1]", "xi[1]")) # --------- Misspecified Poisson model for negative binomial data ---------- # ... following Section 5.1 of the "Object-Oriented Computation of Sandwich # Estimators" vignette of the sandwich package # https://cran.r-project.org/web/packages/sandwich/vignettes/sandwich-OOP.pdf # Simulate data set.seed(123) x <- rnorm(250) y <- rnbinom(250, mu = exp(1 + x), size = 1) # Fit misspecified Poisson model fm_pois <- glm(y ~ x + I(x^2), family = poisson) summary(fm_pois)$coefficients#> Estimate Std. Error z value Pr(>|z|) #> (Intercept) 1.06326821 0.04135723 25.709367 9.184267e-146 #> x 0.99607219 0.05353446 18.606186 2.862861e-77 #> I(x^2) -0.04912373 0.02314608 -2.122335 3.380961e-02# Contributions to the independence loglikelihood pois_glm_loglik <- function(pars, y, x) { log_mu <- pars[1] + pars[2] * x + pars[3] * x ^ 2 return(dpois(y, lambda = exp(log_mu), log = TRUE)) } pars <- c("alpha", "beta", "gamma") pois_quad <- adjust_loglik(pois_glm_loglik, y = y, x = x, par_names = pars) summary(pois_quad)#> MLE SE adj. SE #> alpha 1.06300 0.04136 0.08378 #> beta 0.99610 0.05354 0.10520 #> gamma -0.04913 0.02315 0.03628# Providing algebraic derivatives and Hessian pois_alg_deriv <- function(pars, y, x) { mu <- exp(pars[1] + pars[2] * x + pars[3] * x ^ 2) return(cbind(y - mu, x * (y - mu), x ^2 * (y - mu))) } pois_alg_hess <- function(pars, y, x) { mu <- exp(pars[1] + pars[2] * x + pars[3] * x ^ 2) alg_hess <- matrix(0, 3, 3) alg_hess[1, ] <- -c(sum(mu), sum(x * mu), sum(x ^ 2 * mu)) alg_hess[2, ] <- -c(sum(x * mu), sum(x ^ 2 * mu), sum(x ^ 3 * mu)) alg_hess[3, ] <- -c(sum(x ^ 2 * mu), sum(x ^ 3 * mu), sum(x ^ 4 * mu)) return(alg_hess) } pois_quad <- adjust_loglik(pois_glm_loglik, y = y, x = x, p = 3, alg_deriv = pois_alg_deriv, alg_hess = pois_alg_hess) summary(pois_quad)#> MLE SE adj. SE #> 1 1.06300 0.04136 0.08378 #> 2 0.99610 0.05354 0.10520 #> 3 -0.04913 0.02315 0.03628if (got_sandwich) { # Providing MLE, H and V # H and V calculated using bread() and meat() from sandwich package n_obs <- stats::nobs(fm_pois) pois_quad <- adjust_loglik(pois_glm_loglik, y = y, x = x, p = 3, mle = fm_pois$coefficients, H = -solve(sandwich::bread(fm_pois) / n_obs), V = sandwich::meat(fm_pois) * n_obs) }