A movie to illustrate the nature of the Wald, Wilks and score likelihood-based test statistics, for a model with a scalar unknown parameter \(\theta\). The user can change the value of the parameter under a simple null hypothesis and observe the effect on the test statistics and (approximate) p-values associated with the tests of this hypothesis against the general alternative. The user can specify their own log-likelihood or use one of two in-built examples.
Usage
wws(
model = c("norm", "binom"),
theta_range = NULL,
...,
mult = 3,
theta0 = if (!is.null(theta_range)) sum(c(0.25, 0.75) * theta_range) else NULL,
panel_plot = TRUE,
hscale = NA,
vscale = hscale,
delta_theta0 = if (!is.null(theta_range)) abs(diff(theta_range))/20 else NULL,
theta_mle = NULL,
loglik = NULL,
alg_score = NULL,
alg_obs_info = NULL,
digits = 3
)Arguments
- model
A character scalar. Name of the the distribution on which one of two in-built examples are based.
If
model = "norm"then the setting is a random sample of sizenfrom a normal distribution with unknown meanmu= \(\theta\) and known standard deviationsigma.If
model = "binom"then the setting is a random sample from a Bernoulli distribution with unknown success probability \(\theta\).The behaviour of these examples can be controlled using arguments supplied via
.... In particular, the data can be supplied usingdata. Ifmodel = "norm"thenn,mu, andsigmacan also be chosen. The default cases for these examples are:model = "norm":n= 10,mu= 0,sigma= 1 anddatacontains a sample of a sample of sizensimulated, usingNormal, from a normal distribution with meanmuand standard deviationsigma.model = "binom":data = c(7, 13), that is, 7 successes and 13 failures observed in 20 trials. For the purposes of this movie there must be at least one success and at least one failure.
- theta_range
A numeric vector of length 2. The range of values of \(\theta\) over which to plot the log-likelihood. If
theta_rangeis not supplied then the argumentmultis used to set the range automatically.- ...
Additional arguments to be passed to
loglik,alg_scoreandalg_obs_infoifloglikis supplied, or to functions functions relating to the in-built examples otherwise. See the description ofmodelabove for details.- mult
A positive numeric scalar. If
theta_rangeis not supplied then an interval of width 2 xmultstandard errors centred ontheta_mleis used. Ifmodel = "binom"thentheta_rangeis truncated to (0,1) if necessary.- theta0
A numeric scalar. The value of \(\theta\) under the null hypothesis to use at the start of the movie.
- panel_plot
A logical parameter that determines whether the plot is placed inside the panel (
TRUE) or in the standard graphics window (FALSE). If the plot is to be placed inside the panel then the tkrplot library is required.- hscale, vscale
Numeric scalars. Scaling parameters for the size of the plot when
panel_plot = TRUE. The default values are 1.4 on Unix platforms and 2 on Windows platforms.- delta_theta0
A numeric scalar. The amount by which the value of
theta0is increased (or decreased) after one click of the + (or -) button in the parameter window.- theta_mle
A numeric scalar. The user may use this to supply the value of the maximum likelihood estimate (MLE) of \(\theta\). Otherwise,
optimis used to search for the MLE, usingtheta0as the initial value andtheta_rangeas bounds within which to search.- loglik
An R function, vectorised with respect to its first argument, that returns the value of the log-likelihood (up to an additive constant). The movie will not work if the observed information is not finite at the maximum likelihood estimate.
- alg_score
A R function that returns the score function, that is, the derivative of
loglikwith respect to \(\theta\).- alg_obs_info
A R function that returns the observed information that is, the negated second derivative of
loglikwith respect to \(\theta\).- digits
An integer indicating the number of significant digits to be used in the displayed values of the test statistics and p-values. See
signif.
Details
The Wald, Wilks (or likelihood ratio) and Score tests are asymptotically equivalent tests of a simple hypothesis that a parameter of interest \(\theta\) is equal to a particular value \(\theta_0\). The test statistics are all based on the log-likelihood \(l(\theta\) for \(\theta\) but they differ in the way that they measure the distance between the maximum likelihood estimate (MLE) of \(\theta\) and \(\theta_0\). The Wilks statistic is the amount by which the log-likelihood evaluated \(\theta_0\) is smaller than the log-likelihood evaluated at the MLE. The Walk statistics is based on the absolute difference between the MLE and \(\theta_0\). The score test is based on the gradient of the log-likelihood (the score function) at \(\theta_0\). For details see Azzalini (1996).
This movie illustrates the differences between the test
statistics for simple models with a single scalar parameter.
In the (default) normal example the three test statistics coincide.
This is not true in general, as shown by the other in-built example
(distn = "binom").
A user-supplied log-likelihood can be provided via loglik.
References
Azzalini, A. (1996) Statistical Inference Based on the Likelihood, Chapman & Hall / CRC, London.
Examples
# N(theta, 1) example, test statistics equivalent
wws(theta0 = 0.8)
# binomial(20, theta) example, test statistics similar
wws(theta0 = 0.5, model = "binom")
# binomial(20, theta) example, test statistic rather different
# for theta0 distant from theta_mle
wws(theta0 = 0.9, model = "binom", data = c(19, 1), theta_range = c(0.1, 0.99))
# binomial(2000, theta) example, test statistics very similar
wws(theta0 = 0.5, model = "binom", data = c(1000, 1000))
set.seed(47)
x <- rnorm(10)
wws(theta0 = 0.2, model = "norm", theta_range = c(-1, 1))
# Log-likelihood for a binomial experiment (up to an additive constant)
bin_loglik <- function(p, n_success, n_failure) {
return(n_success * log(p) + n_failure * log(1 - p))
}
wws(loglik = bin_loglik, theta0 = 0.5, theta_range = c(0.1, 0.7),
theta_mle = 7 / 20, n_success = 7, n_failure = 13)
bin_alg_score <- function(p, n_success, n_failure) {
return(n_success / p - n_failure / (1 - p))
}
bin_alg_obs_info <- function(p, n_success, n_failure) {
return(n_success / p ^ 2 + n_failure / (1 - p) ^ 2)
}
wws(loglik = bin_loglik, theta0 = 0.5, theta_range = c(0.1, 0.7),
theta_mle = 7 / 20, n_success = 7, n_failure = 13,
alg_score = bin_alg_score, alg_obs_info = bin_alg_obs_info)