Processing math: 100%

A movie to illustrate the nature of the Wald, Wilks and score likelihood-based test statistics, for a model with a scalar unknown parameter θ. The user can change the value of the parameter under a simple null hypothesis and observe the effect on the test statistics and (approximate) p-values associated with the tests of this hypothesis against the general alternative. The user can specify their own log-likelihood or use one of two in-built examples.

wws(
  model = c("norm", "binom"),
  theta_range = NULL,
  ...,
  mult = 3,
  theta0 = if (!is.null(theta_range)) sum(c(0.25, 0.75) * theta_range) else NULL,
  panel_plot = TRUE,
  hscale = NA,
  vscale = hscale,
  delta_theta0 = if (!is.null(theta_range)) abs(diff(theta_range))/20 else NULL,
  theta_mle = NULL,
  loglik = NULL,
  alg_score = NULL,
  alg_obs_info = NULL,
  digits = 3
)

Arguments

model

A character scalar. Name of the the distribution on which one of two in-built examples are based.

If model = "norm" then the setting is a random sample of size n from a normal distribution with unknown mean mu = θ and known standard deviation sigma.

If model = "binom" then the setting is a random sample from a Bernoulli distribution with unknown success probability θ.

The behaviour of these examples can be controlled using arguments supplied via .... In particular, the data can be supplied using data. If model = "norm" then n, mu, and sigma can also be chosen. The default cases for these examples are:

  • model = "norm": n = 10, mu = 0, sigma = 1 and data contains a sample of a sample of size n simulated, using Normal, from a normal distribution with mean mu and standard deviation sigma.

  • model = "binom": data = c(7, 13), that is, 7 successes and 13 failures observed in 20 trials. For the purposes of this movie there must be at least one success and at least one failure.

theta_range

A numeric vector of length 2. The range of values of θ over which to plot the log-likelihood. If theta_range is not supplied then the argument mult is used to set the range automatically.

...

Additional arguments to be passed to loglik, alg_score and alg_obs_info if loglik is supplied, or to functions functions relating to the in-built examples otherwise. See the description of model above for details.

mult

A positive numeric scalar. If theta_range is not supplied then an interval of width 2 x mult standard errors centred on theta_mle is used. If model = "binom" then theta_range is truncated to (0,1) if necessary.

theta0

A numeric scalar. The value of θ under the null hypothesis to use at the start of the movie.

panel_plot

A logical parameter that determines whether the plot is placed inside the panel (TRUE) or in the standard graphics window (FALSE). If the plot is to be placed inside the panel then the tkrplot library is required.

hscale, vscale

Numeric scalars. Scaling parameters for the size of the plot when panel_plot = TRUE. The default values are 1.4 on Unix platforms and 2 on Windows platforms.

delta_theta0

A numeric scalar. The amount by which the value of theta0 is increased (or decreased) after one click of the + (or -) button in the parameter window.

theta_mle

A numeric scalar. The user may use this to supply the value of the maximum likelihood estimate (MLE) of θ. Otherwise, optim is used to search for the MLE, using theta0 as the initial value and theta_range as bounds within which to search.

loglik

An R function, vectorised with respect to its first argument, that returns the value of the log-likelihood (up to an additive constant). The movie will not work if the observed information is not finite at the maximum likelihood estimate.

alg_score

A R function that returns the score function, that is, the derivative of loglik with respect to θ.

alg_obs_info

A R function that returns the observed information that is, the negated second derivative of loglik with respect to θ.

digits

An integer indicating the number of significant digits to be used in the displayed values of the test statistics and p-values. See signif.

Value

Nothing is returned, only the animation is produced.

Details

The Wald, Wilks (or likelihood ratio) and Score tests are asymptotically equivalent tests of a simple hypothesis that a parameter of interest θ is equal to a particular value θ0. The test statistics are all based on the log-likelihood l(θ for θ but they differ in the way that they measure the distance between the maximum likelihood estimate (MLE) of θ and θ0. The Wilks statistic is the amount by which the log-likelihood evaluated θ0 is smaller than the log-likelihood evaluated at the MLE. The Walk statistics is based on the absolute difference between the MLE and θ0. The score test is based on the gradient of the log-likelihood (the score function) at θ0. For details see Azzalini (1996).

This movie illustrates the differences between the test statistics for simple models with a single scalar parameter. In the (default) normal example the three test statistics coincide. This is not true in general, as shown by the other in-built example (distn = "binom").

A user-supplied log-likelihood can be provided via loglik.

References

Azzalini, A. (1996) Statistical Inference Based on the Likelihood, Chapman & Hall / CRC, London.

See also

movies: a user-friendly menu panel.

smovie: general information about smovie.

Examples

# N(theta, 1) example, test statistics equivalent
wws(theta0 = 0.8)

# binomial(20, theta) example, test statistics similar
wws(theta0 = 0.5, model = "binom")

# binomial(20, theta) example, test statistic rather different
# for theta0 distant from theta_mle
wws(theta0 = 0.9, model = "binom", data = c(19, 1), theta_range = c(0.1, 0.99))

# binomial(2000, theta) example, test statistics very similar
wws(theta0 = 0.5, model = "binom", data = c(1000, 1000))

set.seed(47)
x <- rnorm(10)
wws(theta0 = 0.2, model = "norm", theta_range = c(-1, 1))

# Log-likelihood for a binomial experiment (up to an additive constant)
bin_loglik <- function(p, n_success, n_failure) {
  return(n_success * log(p) + n_failure * log(1 - p))
}

wws(loglik = bin_loglik, theta0 = 0.5, theta_range = c(0.1, 0.7),
    theta_mle = 7 / 20, n_success = 7, n_failure = 13)

bin_alg_score <- function(p, n_success, n_failure) {
  return(n_success / p - n_failure / (1 - p))
}
bin_alg_obs_info <- function(p, n_success, n_failure) {
  return(n_success / p ^ 2 + n_failure / (1 - p) ^ 2)
}
wws(loglik = bin_loglik, theta0 = 0.5, theta_range = c(0.1, 0.7),
    theta_mle = 7 / 20, n_success = 7, n_failure = 13,
    alg_score = bin_alg_score, alg_obs_info = bin_alg_obs_info)