What does anscombiser do?

Anscombe’s quartet is a set of four two-variable datasets that share several summary statistics (essentially the means, variances and correlation) but have very different joint distributions. The differences become apparent when the data are plotted, which illustrates the importance of graphical displays in statistics. The anscombiser package provides a quick and easy way to create further datasets that have the same values of Anscombe’s summary statistics but look very different when plotted. It does this by transforming (shifting, scaling and rotating) an input dataset to achieve the target summary statistics.
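To make the idea of hitting target summary statistics concrete, here is a minimal base-R sketch. It is not the package's implementation: match_moments() is a hypothetical helper, and the Cholesky-based linear map is just one of many choices that reproduce a given mean vector and covariance matrix (and therefore the means, variances and correlation).

```r
# Hypothetical helper for illustration only; not part of anscombiser.
match_moments <- function(x, target_mean, target_cov) {
  x <- as.matrix(x)
  centred <- sweep(x, 2, colMeans(x))   # shift: column means become 0
  R_x <- chol(cov(x))                   # t(R_x) %*% R_x equals cov(x)
  R_t <- chol(target_cov)               # t(R_t) %*% R_t equals target_cov
  whitened <- centred %*% solve(R_x)    # sample covariance is the identity
  # Impose the target covariance, then shift to the target means
  sweep(whitened %*% R_t, 2, target_mean, "+")
}

# A noisy ring of points as the input dataset
set.seed(1)
angle <- runif(100, 0, 2 * pi)
ring <- cbind(x = cos(angle), y = sin(angle)) +
  matrix(rnorm(200, sd = 0.05), ncol = 2)

# Use the first dataset of Anscombe's quartet (built into R) as the target
target <- anscombe[, c("x1", "y1")]
new_ring <- match_moments(ring, colMeans(target), cov(target))

colMeans(new_ring)  # agrees with colMeans(target)
cov(new_ring)       # agrees with cov(target)
```

The transformed ring still looks like a (stretched) ring when plotted, even though its means, variances and correlation now agree with those of the target.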

An example

The mimic() function transforms an input dataset (dino below left) so that it has the same values of Anscombe’s summary statistics as another dataset (trump below right).

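library(anscombiser)
library(datasauRus)
# Extract the x and y columns of the datasauRus dino dataset
dino <- datasaurus_dozen_wide[, c("dino_x", "dino_y")]
# Transform dino so that it shares Anscombe's summary statistics with trump
new_dino <- mimic(dino, trump)
# Plot the transformed dino dataset
plot(new_dino, legend_args = list(x = "topright"))
# Plot the dataset that was mimicked (trump)
plot(new_dino, input = TRUE, legend_args = list(x = "bottomright"), pch = 20)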

In this example the two datasets had similar summary statistics from the outset, so the appearance of the dino dataset has changed little. When the summary statistics differ more, the input dataset is deformed, but its general shape remains recognisable.

The rotation applied to the input dataset is not unique. The mimic() function (and the anscombise() function, which is specific to Anscombe’s quartet) has an argument, idempotent, that controls how the rotation is performed. In the special case where the input dataset already has the desired summary statistics, using idempotent = TRUE ensures that the output dataset is the same as the input dataset.
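The non-uniqueness of the rotation can be illustrated with a short base-R sketch. This is an illustration under assumptions, not the package's code: sym_sqrt() and rotate_to_target() are hypothetical helpers, and the symmetric matrix square root is simply one construction that returns the input unchanged when its summary statistics already match the target. Inserting any orthogonal matrix Q into the transformation gives a different dataset with the same means and covariance matrix.

```r
# Hypothetical helpers for illustration only; not part of anscombiser.

# Symmetric square root of a covariance matrix via its eigendecomposition
sym_sqrt <- function(S) {
  e <- eigen(S, symmetric = TRUE)
  e$vectors %*% diag(sqrt(e$values), nrow = length(e$values)) %*% t(e$vectors)
}

# Map x to the target mean and covariance, with an optional extra rotation Q
rotate_to_target <- function(x, target_mean, target_cov, Q = diag(ncol(x))) {
  x <- as.matrix(x)
  centred <- sweep(x, 2, colMeans(x))
  # Whiten with cov(x)^(-1/2), rotate by Q, then impose the target covariance
  A <- solve(sym_sqrt(cov(x))) %*% Q %*% sym_sqrt(target_cov)
  sweep(centred %*% A, 2, target_mean, "+")
}

x <- as.matrix(anscombe[, c("x1", "y1")])

# A rotation through 60 degrees
theta <- pi / 3
Q <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)

# The target statistics are those of x itself
same <- rotate_to_target(x, colMeans(x), cov(x))            # no extra rotation
turned <- rotate_to_target(x, colMeans(x), cov(x), Q = Q)   # extra rotation

isTRUE(all.equal(same, x, check.attributes = FALSE))         # input recovered
isTRUE(all.equal(cov(turned), cov(x), check.attributes = FALSE))  # same statistics
isTRUE(all.equal(turned, x, check.attributes = FALSE))       # but different points
```

With no extra rotation the sketch reproduces its input, which is the behaviour that idempotent = TRUE is described as guaranteeing; with the extra rotation the points move while the summary statistics stay fixed.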

Installation

To get the current released version from CRAN:

install.packages("anscombiser")

Vignette

See vignette("intro-to-anscombiser", package = "anscombiser") for an overview of the package.