A movie to illustrate the nature of the Wald, Wilks and score likelihood-based test statistics, for a model with a scalar unknown parameter θ. The user can change the value of the parameter under a simple null hypothesis and observe the effect on the test statistics and (approximate) p-values associated with the tests of this hypothesis against the general alternative. The user can specify their own log-likelihood or use one of two in-built examples.
wws(
model = c("norm", "binom"),
theta_range = NULL,
...,
mult = 3,
theta0 = if (!is.null(theta_range)) sum(c(0.25, 0.75) * theta_range) else NULL,
panel_plot = TRUE,
hscale = NA,
vscale = hscale,
delta_theta0 = if (!is.null(theta_range)) abs(diff(theta_range))/20 else NULL,
theta_mle = NULL,
loglik = NULL,
alg_score = NULL,
alg_obs_info = NULL,
digits = 3
)
A character scalar. Name of the the distribution on which one of two in-built examples are based.
If model = "norm"
then the setting is a random sample of
size n
from a normal distribution with unknown mean
mu
= θ and known standard deviation sigma
.
If model = "binom"
then the setting is a random sample from a
Bernoulli distribution with unknown success probability
θ.
The behaviour of these examples can be controlled using arguments
supplied via ...
. In particular, the data can be supplied
using data
. If model = "norm"
then n
, mu
,
and sigma
can also be chosen.
The default cases for these examples are:
model = "norm"
: n
= 10, mu
= 0,
sigma
= 1 and data
contains a sample of
a sample of size n
simulated, using
Normal
, from a normal distribution with mean
mu
and standard deviation sigma
.
model = "binom"
: data = c(7, 13)
, that is,
7 successes and 13 failures observed in 20 trials. For the purposes
of this movie there must be at least one success and at least one
failure.
A numeric vector of length 2. The range of values of
θ over which to plot the log-likelihood.
If theta_range
is not supplied then the argument mult
is used to set the range automatically.
Additional arguments to be passed to loglik
,
alg_score
and alg_obs_info
if loglik
is supplied,
or to functions functions relating to the in-built examples otherwise.
See the description of model
above for details.
A positive numeric scalar. If theta_range
is not
supplied then an interval of width 2 x mult
standard errors centred
on theta_mle
is used. If model = "binom"
then
theta_range
is truncated to (0,1) if necessary.
A numeric scalar. The value of θ under the null hypothesis to use at the start of the movie.
A logical parameter that determines whether the plot
is placed inside the panel (TRUE
) or in the standard graphics
window (FALSE
). If the plot is to be placed inside the panel
then the tkrplot library is required.
Numeric scalars. Scaling parameters for the size
of the plot when panel_plot = TRUE
. The default values are 1.4 on
Unix platforms and 2 on Windows platforms.
A numeric scalar. The amount by which the value of
theta0
is increased (or decreased) after one click of the + (or -)
button in the parameter window.
A numeric scalar. The user may use this to supply the
value of the maximum likelihood estimate (MLE) of θ.
Otherwise, optim
is used to search for the MLE,
using theta0
as the initial value and theta_range
as
bounds within which to search.
An R function, vectorised with respect to its first argument, that returns the value of the log-likelihood (up to an additive constant). The movie will not work if the observed information is not finite at the maximum likelihood estimate.
A R function that returns the score function, that is,
the derivative of loglik
with respect to θ.
A R function that returns the observed information
that is, the negated second derivative of loglik
with respect
to θ.
An integer indicating the number of significant digits to
be used in the displayed values of the test statistics and
p-values. See signif
.
Nothing is returned, only the animation is produced.
The Wald, Wilks (or likelihood ratio) and Score tests are asymptotically equivalent tests of a simple hypothesis that a parameter of interest θ is equal to a particular value θ0. The test statistics are all based on the log-likelihood l(θ for θ but they differ in the way that they measure the distance between the maximum likelihood estimate (MLE) of θ and θ0. The Wilks statistic is the amount by which the log-likelihood evaluated θ0 is smaller than the log-likelihood evaluated at the MLE. The Walk statistics is based on the absolute difference between the MLE and θ0. The score test is based on the gradient of the log-likelihood (the score function) at θ0. For details see Azzalini (1996).
This movie illustrates the differences between the test
statistics for simple models with a single scalar parameter.
In the (default) normal example the three test statistics coincide.
This is not true in general, as shown by the other in-built example
(distn
= "binom").
A user-supplied log-likelihood can be provided via loglik
.
Azzalini, A. (1996) Statistical Inference Based on the Likelihood, Chapman & Hall / CRC, London.
# N(theta, 1) example, test statistics equivalent
wws(theta0 = 0.8)
# binomial(20, theta) example, test statistics similar
wws(theta0 = 0.5, model = "binom")
# binomial(20, theta) example, test statistic rather different
# for theta0 distant from theta_mle
wws(theta0 = 0.9, model = "binom", data = c(19, 1), theta_range = c(0.1, 0.99))
# binomial(2000, theta) example, test statistics very similar
wws(theta0 = 0.5, model = "binom", data = c(1000, 1000))
set.seed(47)
x <- rnorm(10)
wws(theta0 = 0.2, model = "norm", theta_range = c(-1, 1))
# Log-likelihood for a binomial experiment (up to an additive constant)
bin_loglik <- function(p, n_success, n_failure) {
return(n_success * log(p) + n_failure * log(1 - p))
}
wws(loglik = bin_loglik, theta0 = 0.5, theta_range = c(0.1, 0.7),
theta_mle = 7 / 20, n_success = 7, n_failure = 13)
bin_alg_score <- function(p, n_success, n_failure) {
return(n_success / p - n_failure / (1 - p))
}
bin_alg_obs_info <- function(p, n_success, n_failure) {
return(n_success / p ^ 2 + n_failure / (1 - p) ^ 2)
}
wws(loglik = bin_loglik, theta0 = 0.5, theta_range = c(0.1, 0.7),
theta_mle = 7 / 20, n_success = 7, n_failure = 13,
alg_score = bin_alg_score, alg_obs_info = bin_alg_obs_info)