A movie to illustrate the ideas of the sampling distribution of the sample 100\(p\)% quantile and the central limit theorem for sample quantiles.
cltq(
n = 20,
p = 0.5,
distn,
params = list(),
type = 7,
panel_plot = TRUE,
hscale = NA,
vscale = hscale,
n_add = 1,
delta_n = 1,
arrow = TRUE,
leg_cex = 1.25,
...
)
An integer scalar. The size of the samples drawn from the distribution chosen using distn.
A numeric scalar in (0, 1). The value of \(p\).
A character scalar specifying the (continuous) distribution from which observations are sampled. Distributions "beta", "chisq", "chi-squared", "exponential", "f", "gamma", "gev", "gp", "lognormal", "log-normal", "normal", "t", "uniform" and "weibull" are recognised, case being ignored.
If distn is not supplied then distn = "exponential" is used.
The "gev" and "gp" cases use the gev and gp distributional functions in the revdbayes package. The other cases use the distributional functions in the stats package.
If distn = "gamma" then the (shape, rate) parameterisation is used. If scale is supplied via params then rate is inferred from this.
If distn = "beta" then ncp is forced to be zero.
A named list of additional arguments to be passed to the density function associated with distribution distn. The (shape, rate) parameterisation is used for the gamma distribution (see GammaDist) even if the value of the scale parameter is set using params.
If a parameter value is not supplied then the default values in the relevant distributional function set using distn are used, except for "beta" (shape1 = 2, shape2 = 2), "chisq" (df = 4), "f" (df1 = 4, df2 = 8), "gev" (shape = 0.2), "gamma" (shape = 2), "gp" (shape = 0.1), "t" (df = 4) and "weibull" (shape = 2).
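The relationship between the scale and rate forms of the gamma distribution can be checked directly with base R. The sketch below does not call cltq; it uses only stats::qgamma, and the helper name rate_from_params is hypothetical (it is not part of the package). It assumes that inferring the rate from a supplied scale means taking rate = 1 / scale.

```r
# Hypothetical helper (not part of the package): infer a gamma rate from a
# params-style list that may contain either `rate` or `scale`.
rate_from_params <- function(params) {
  if (!is.null(params[["rate"]])) {
    return(params[["rate"]])
  }
  if (!is.null(params[["scale"]])) {
    return(1 / params[["scale"]])
  }
  1  # default rate used by stats::qgamma
}

params <- list(shape = 2, scale = 4)
rate <- rate_from_params(params)

# The median is the same under either parameterisation
med_rate <- qgamma(0.5, shape = params[["shape"]], rate = rate)
med_scale <- qgamma(0.5, shape = params[["shape"]], scale = params[["scale"]])
c(rate = rate, med_rate = med_rate, med_scale = med_scale)
```

Under this assumption, params = list(shape = 2, scale = 4) and params = list(shape = 2, rate = 0.25) describe the same distribution.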
An integer between 1 and 9. The value of the argument type to be passed to quantile when calculating a sample quantile.
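To see why the choice of type can matter for small samples, the following sketch computes the sample median of one simulated exponential sample under each of the nine definitions offered by stats::quantile. The seed and the sample size of 20 are illustrative choices, not values taken from the package internals.

```r
set.seed(1)
x <- rexp(20)  # one illustrative sample of size 20

# Sample median under each of the nine quantile definitions
medians <- sapply(1:9, function(type) {
  quantile(x, probs = 0.5, type = type, names = FALSE)
})
names(medians) <- paste0("type", 1:9)
medians
```

The default, type = 7, agrees with median(x). The differences between types shrink as the sample size grows.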
A logical parameter that determines whether the plot is placed inside the panel (TRUE) or in the standard graphics window (FALSE). If the plot is to be placed inside the panel then the tkrplot library is required.
Numeric scalars. Scaling parameters for the size of the plot when panel_plot = TRUE. The default values are 1.4 on Unix platforms and 2 on Windows platforms.
An integer scalar. The number of simulated datasets to add to each new frame of the movie.
A numeric scalar. The amount by which n is increased (or decreased) after one click of the + (or -) button in the parameter window.
A logical scalar. Should an arrow be included to show the simulated sample quantile from the top plot being placed into the bottom plot?
The argument cex to legend. Allows the size of the legend to be controlled manually.
Additional arguments to the rpanel functions rp.button and rp.doublebutton, not including panel, variable, title, step, action, initval, range.
Nothing is returned; only the animation is produced.
Loosely speaking, a consequence of the CLT for sample quantiles is that the 100\(p\)% sample quantile of a large number of independent and identically distributed random variables, each with probability density function \(f\) and 100\(p\)% quantile \(\xi(p)\), has approximately a normal distribution. See, for example, Lehmann (1999) for a precise statement and conditions.
This movie considers examples where this limiting result holds and illustrates graphically the closeness of the limiting approximation provided by the relevant normal limit to the true finite-\(n\) distribution.
Samples of size n are repeatedly simulated from the distribution chosen using distn. These samples are summarized using a plot that appears at the top of the movie screen. For each sample the 100\(p\)% sample quantile of these n values is calculated, stored and added to another plot, situated below the first plot. This plot is either a histogram or an empirical c.d.f., chosen using a radio button. A rug is added to a histogram provided that it contains no more than 1000 points. The p.d.f. of the original variables is added to the top plot.
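The simulate-and-store process described above can be imitated non-interactively. This is a minimal sketch using base R only, not the package's own implementation: the function name simulate_quantiles, the value n_add = 100 and the five simulated "clicks" are all illustrative assumptions.

```r
set.seed(42)
n <- 20        # sample size
p <- 0.5       # quantile level
n_add <- 100   # samples added per "click" (illustrative value)

# One "click": simulate n_add exponential samples of size n and
# return the 100p% sample quantile of each
simulate_quantiles <- function(n_add, n, p, type = 7) {
  replicate(n_add, quantile(rexp(n), probs = p, type = type, names = FALSE))
}

stored <- numeric(0)
for (click in 1:5) {
  stored <- c(stored, simulate_quantiles(n_add, n, p))
}

# Analogues of the bottom plot, drawn only in an interactive session
if (interactive()) {
  hist(stored, probability = TRUE, main = "Sample medians",
       xlab = "sample quantile")
  if (length(stored) <= 1000) rug(stored)  # rug only for up to 1000 points
  plot(ecdf(stored), main = "Empirical c.d.f. of sample medians")
}
```

Each pass through the loop plays the role of one click of the simulate button, so stored grows by n_add values at a time.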
Once it starts, four aspects of this movie are controlled by the user.
There are buttons to increase (+) or decrease (-) the sample size, that is, the number of values for which a sample quantile is calculated.
Each time the button labelled "simulate another n_add samples of size n" is clicked n_add new samples are simulated and their sample quantiles are added to the bottom plot.
There is a button to switch the bottom plot from displaying a histogram of the simulated sample quantiles and the limiting normal p.d.f. to the empirical c.d.f. of the simulated data and the limiting normal c.d.f.
There is a checkbox to add to the bottom plot the approximate (large \(n\)) normal p.d.f./c.d.f. implied by the CLT for sample quantiles: the mean is equal to \(\xi(p)\) and the standard deviation is equal to \(\sqrt{pq} / (\sqrt{n} f(\xi(p)))\), where \(q = 1-p\).
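The limiting normal approximation can be checked numerically without the movie. The sketch below uses exponential data with rate 1 and the asymptotic variance \(pq / (n f(\xi(p))^2)\) from the CLT for sample quantiles. The sample size of 200 and the 5000 replications are illustrative choices.

```r
set.seed(123)
n <- 200
p <- 0.5
q <- 1 - p

xi <- qexp(p)                              # true 100p% quantile: log(2) when p = 0.5
f_xi <- dexp(xi)                           # density evaluated at that quantile
clt_sd <- sqrt(p * q) / (sqrt(n) * f_xi)   # limiting standard deviation

# Simulate many sample quantiles and compare with the limiting values
sims <- replicate(5000, quantile(rexp(n), probs = p, names = FALSE))
c(clt_mean = xi, sim_mean = mean(sims))
c(clt_sd = clt_sd, sim_sd = sd(sims))
```

For smaller n the simulated mean and standard deviation drift further from the limiting values, which is the finite-\(n\) discrepancy that the movie displays graphically.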
Lehmann, E. L. (1999) Elements of Large-Sample Theory, Springer-Verlag, London. doi:10.1007/b98855
# Exponential data
cltq()
# Student's t data with 2 degrees of freedom
cltq(distn = "t", params = list(df = 2))