A movie to compare the sampling distributions of the sample mean and sample median based on a random sample of size \(n\) from either a standard normal distribution or a standard Student's \(t\) distribution. An interesting comparison is between the normal case and the Student's \(t\) case with 2 degrees of freedom (see Examples).
mean_vs_median(
n = 10,
t_df = NULL,
panel_plot = TRUE,
hscale = NA,
vscale = hscale,
n_add = 1,
delta_n = 1,
arrow = TRUE,
leg_cex = 1.75,
...
)
An integer scalar. The size of the samples drawn from a standard normal distribution or, if t_df is supplied, from a standard Student's t distribution.
A positive scalar. The degrees of freedom df of a Student t distribution, as in TDist. If t_df is not supplied then data are simulated from a standard normal distribution.
A logical parameter that determines whether the plot is placed inside the panel (TRUE) or in the standard graphics window (FALSE). If the plot is to be placed inside the panel then the tkrplot library is required.
Numeric scalars. Scaling parameters for the size of the plot when panel_plot = TRUE. The default values are 1.4 on Unix platforms and 2 on Windows platforms.
An integer scalar. The number of simulated datasets to add to each new frame of the movie.
A numeric scalar. The amount by which n is increased (or decreased) after one click of the + (or -) button in the parameter window.
A logical scalar. Should arrows be included to show the simulated sample mean and sample median from the top plot being placed into the middle and bottom plots?
The argument cex to legend. Allows the size of the legend to be controlled manually.
Additional arguments to the rpanel functions rp.button and rp.doublebutton, not including panel, variable, title, step, action, initval, range.
Nothing is returned, only the animation is produced.
The movie is based on repeatedly simulating samples of size n from either a standard normal N(0,1) distribution or a standard Student t distribution. The latter is selected by supplying the degrees of freedom of this distribution, using t_df. The movie contains three plots. The top plot contains a histogram of the most recently simulated dataset, with the relevant probability density function (p.d.f.) superimposed. A rug is added to this histogram provided that it contains no more than 1000 points.
Each time a sample is simulated the sample mean and sample median are calculated. These values are indicated on the top plot using an arrow (if arrow = TRUE) or a vertical (rug) line on the horizontal axis (arrow = FALSE), coloured red for the sample mean and blue for the sample median.
If arrow = TRUE then the arrows show the positions of the most recent mean and median in the two plots below. If arrow = FALSE then the rug lines are replicated in these plots.
The plot in the middle contains a histogram of the sample means of all the simulated samples.
The plot on the bottom contains a histogram of the sample medians of all the simulated samples.
A rug is added to these histograms provided that they contain no more than 1000 points.
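The simulation behind the middle and bottom plots can be approximated without the interactive panel. The following is a minimal base R sketch, not the package's own code: the helper name sim_mean_median and the choice of 1000 simulated samples are assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible

# Hypothetical helper (not part of the package): simulate n_sim samples of
# size n and return the sample mean and sample median of each sample.
sim_mean_median <- function(n_sim, n = 10, t_df = NULL) {
  values <- if (is.null(t_df)) {
    rnorm(n * n_sim)            # standard normal case
  } else {
    rt(n * n_sim, df = t_df)    # Student t case with t_df degrees of freedom
  }
  samples <- matrix(values, nrow = n)  # one simulated sample per column
  list(means = colMeans(samples), medians = apply(samples, 2, median))
}

normal_sims <- sim_mean_median(1000, n = 10)
t2_sims <- sim_mean_median(1000, n = 10, t_df = 2)

# Static analogue of the middle and bottom plots for the t(2) case,
# with a rug added because there are no more than 1000 points.
old_par <- par(mfrow = c(2, 1))
hist(t2_sims$means, main = "Sample means, t(2)", xlab = "mean")
rug(t2_sims$means, col = "red")
hist(t2_sims$medians, main = "Sample medians, t(2)", xlab = "median")
rug(t2_sims$medians, col = "blue")
par(old_par)

# Compare the spread of the two estimators in each case
sapply(normal_sims, IQR)
sapply(t2_sims, IQR)
```

Using the interquartile range rather than the standard deviation keeps the comparison meaningful in the t(2) case, where the variance of the underlying distribution is infinite.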
Once it starts, three aspects of this movie are controlled by the user.
There are buttons to increase (+) or decrease (-) the sample size, that is, the number of values over which the sample mean and sample median are calculated.
Each time the button labelled "simulate another n_add samples of size n" is clicked n_add new samples are simulated and their sample means and sample medians are added to the middle and bottom histograms, respectively.
For the N(0,1) case only, there is a checkbox to add to the bottom plot the p.d.f.s of the distribution of the sample mean and the (approximate, large n) distribution of the sample median.
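For reference, the two curves in the N(0,1) case can be sketched in base R using standard results: the sample mean is exactly N(0, 1/n), and for large n the sample median is approximately N(0, pi/(2n)). This is an illustrative sketch under those results, not the package's plotting code. The variable names and the plotting range are assumptions.

```r
n <- 10  # sample size, matching the default in the usage above

sd_mean <- 1 / sqrt(n)            # exact standard deviation of the sample mean
sd_median <- sqrt(pi / (2 * n))   # large-n approximation for the sample median

x <- seq(-1.5, 1.5, length.out = 201)
pdf_mean <- dnorm(x, mean = 0, sd = sd_mean)
pdf_median <- dnorm(x, mean = 0, sd = sd_median)

# Red for the sample mean and blue for the sample median, as in the movie
plot(x, pdf_mean, type = "l", col = "red", xlab = "value", ylab = "density")
lines(x, pdf_median, col = "blue")
legend("topright", legend = c("sample mean", "sample median (approx.)"),
       col = c("red", "blue"), lty = 1)

# Ratio of standard deviations: sqrt(pi / 2), about 1.25, for any n
sd_median / sd_mean
```

The ratio does not depend on n, so under normal sampling the median is the less precise estimator of the centre at every sample size for which the approximation is reasonable.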
# Sampling from a standard normal distribution
mean_vs_median()
# Sampling from a standard t(2) distribution
mean_vs_median(t_df = 2)
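A further usage sketch, combining several of the arguments described above. It assumes that mean_vs_median has already been attached from its package and that an interactive session is available. The particular values n = 20 and n_add = 10 are arbitrary choices for illustration.

```r
# Use the panel only if tkrplot is installed; otherwise fall back to the
# standard graphics window (panel_plot = FALSE).
use_panel <- requireNamespace("tkrplot", quietly = TRUE)

# Guarded so that the sketch does nothing when the function is not attached
# or the session is not interactive.
if (interactive() && exists("mean_vs_median", mode = "function")) {
  # t(2) samples of size 20, adding 10 simulated datasets per click
  mean_vs_median(n = 20, t_df = 2, panel_plot = use_panel, n_add = 10)
}
```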