Estimates the extremal index \(\theta\) using a semiparametric block
maxima estimator of Northrop (2015) ("N2015"
) and a variant of this
estimator studied by Berghaus and Bucher (2018) ("BB2018"
), using
both sliding (overlapping) block maxima and disjoint (non-overlapping) block
maxima. A simple modification (subtraction of \(1 / b\), where \(b\) is
the block size) of the Berghaus and Bucher (2018) estimator
("BB2018b"
) is also calculated. Estimates of uncertainty are
calculated using the asymptotic theory developed by Berghaus and Bucher
(2018).
A numeric vector of raw data. No missing values are allowed.
A numeric scalar. The block size.
A character scalar. Is bias-adjustment of the
raw estimate of \(\theta\) performed using the bias-reduced
estimator (bias_adjust = "BB3"
), derived in Section 5 of
Berghaus and Bucher (2018); or a simpler version
(bias_adjust = "BB1"
), in which the raw estimate is multiplied by
\((k-1) / k\), where \(k\) is the number of blocks; or the
bias-adjustment of the empirical distribution function used to calculate
the estimate, as detailed in Section 2 of Northrop (2015). When disjoint
maxima are used bias_adjust = "BB1"
and bias_adjust = "N"
give identical estimates of the Berghaus and Bucher (2018) variant,
as explained at the end of Section 5 of Berghaus and Bucher (2018).
If bias_adjust = "none"
then no bias-adjustment is performed.
A logical scalar. If constrain = TRUE
then
any estimates that are greater than 1 are set to 1,
that is, they are constrained to lie in (0, 1]. This is carried out
after any bias-adjustment. Otherwise,
estimates that are greater than 1 may be obtained.
A logical scalar. If varN = TRUE
then the estimation
of the sampling variance of the Northrop (2015) estimator is tailored
to that estimator. Otherwise, the sampling variance derived in
Berghaus and Bucher (2018) is used.
See Details for further information.
A character scalar. Determines which set of disjoint
maxima are used to calculate an estimate of \(\theta\): "first"
,
only the set whose first block starts on the first observation in
x
; "last"
, only the set whose last block ends on the last
observation in x
.
A list of class c("spm", "exdex")
containing the
components listed below. The components that are vectors are
labelled to indicate the estimator to which the constituent values
relate: "N2015"
for Northrop (2015), "BB2018"
for
Berghaus and Bucher (2018) and "BB2018b"
for the modified version.
Vectors containing the estimates of \(\theta\) resulting from sliding maxima and disjoint maxima respectively.
The estimated standard errors associated
with the estimates in theta_sl
and theta_dj
.
The values for "BB2018"
and "BB2018b"
are identical.
The respective values of the
bias-adjustment applied to the raw estimates, that is, the values
subtracted from the raw estimates. For estimator
BB2018b
this includes a contribution for the subtraction
of 1 / b
.
If bias_adjust = "N"
or "none"
then
bias_sl
and bias_dj
are c(0, 0, 1 / b)
.
Vectors containing the raw estimates of \(\theta\), prior to any bias_adjustment.
The (bias_adjusted) estimates of \(\theta\) before the constraint that they lie in (0, 1] has been applied.
Matrices containing the \(Y\)-data and \(Z\)-data for the sliding an disjoint maxima respectively. The first columns are the \(Y\)-data, the second columns the \(Z\)-data.
Estimates of the variance
\(\sigma_{{\rm dj}}^2\)
defined on pages 2314-2315 of Berghaus and Bucher (2018).
The form of the estimates is given on page 2319.
sigma2dj
is used in estimating the standard error se_dj
,
sigma2dj_for_sl
in estimating the standard error se_sl
.
Estimates of the variance
\(\sigma_{{\rm sl}}^2\).
defined on pages 2314-2315 of Berghaus and Bucher (2018).
The form of the estimates is given on page 2319.
sigma2dj_for_sl
is used to estimate
\(\sigma_{{\rm dj}}^2\) for this purpose.
The input value of b
.
The input value of bias_adjust
.
The call to spm
.
The extremal index \(\theta\) is estimated using the
semiparametric maxima estimator of Northrop (2015) and variant
of this studied by Berghaus and Bucher (2018). In each case a sample
of 'data' is derived from the input data data
, based on the
empirical distribution function of these data evaluated at the
maximum values of of blocks of b
contiguous values in data
.
The estimators are based on an assumption that these 'data' are sampled approximately from an exponential distribution with mean \(1/\theta\). For details see page 2309 of Berghaus and Bucher (2018), where the 'data' for the Northrop (2015) estimator are denoted \(Y\) and those for the Berghaus and Bucher (2018) are denoted \(Z\). For convenience, we will refer to these values as the \(Y\)-data and the \(Z\)-data.
The approximate nature of the model for the \(Y\)-data arises from the estimation of the distribution function \(F\). A further approximation motivates the use of the \(Z\)-data. If \(F\) is known then the variable \(Z / b\) has a beta(1, \(b\theta\)) distribution, so that that is, \(Z\) has mean \(1 / (\theta + 1/b)\). Therefore, an exponential distribution with mean \(1 / (\theta + 1/b)\) may provide a better approximate model, which provides the motivation for subtracting \(1 / b\) from the Berghaus and Bucher (2018) estimator. Indeed, the resulting estimates are typically close to those of the Northrop (2015) estimator.
If sliding = TRUE
then the function uses sliding block maxima,
that is, the largest value observed in all
(length(data) - b + 1
) blocks of b
observations.
If sliding = FALSE
then disjoint block maxima, that is, the
largest values in (floor(n / b)
) disjoint blocks of b
observations, are used.
Estimation of the sampling variances of the estimators is based
on Proposition 4.1 on page 2319 of Berghaus and Bucher (2018).
For the Northrop (2015) variant the user has the choice either to
use the sampling variance based on the Berghaus and Bucher (2018)
estimator, i.e. the \(Z\)-data (varN = FALSE
) or an analogous
version tailored to the Northrop (2015) estimator that uses the
\(Y\)-data (varN = TRUE
).
The estimates of the sampling variances of the sliding blocks estimators
are inferred from those of the disjoint blocks estimators (see page 2319
of Berhaus and Bucher (2018)). The calculation of the latter uses a set
of disjoint block maxima. If length(data)
is not an integer
multiple of b
then there will be more than one set of these, and
all are equally valid. In this event we perform the calculation for all
such sets and use the mean of the resulting estimates. This reduces the
sampling variability of the estimates at the expense of slowing down
the calculation somewhat, particularly if b
is small. This may
become apparent when calling spm
repeatedly in
choose_b
.
This estimator of the sampling variance of the sliding blocks estimator is
not constrained to be positive: a negative estimate may result if the
block size is small. In this event
no warning will be given until the returned object is printed and,
for the affected estimator ("N2015"
or "BB2018/BB2018b"
),
the corresponding estimated standard errors using sliding
blocks will be missing in se_sl
in the returned object,
if bias_adjust == "BB3"
then bias-adjustment
based on bias_adjust == "BB1"
will instead be performed
when using sliding blocks, because the former relies on the
estimated variances of the estimators.
Similarly, bias adjustment under adjust = "BB3"
and/or subtraction
of \(1 / b\) in the "BB2018b"
case may, rare cases, produce a
negative estimate of \(\theta\). In these instances an estimate of
zero is returned, but the values returned in bias_dj
and
bias_sl
are not changed.
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. doi:10.1007/s10687-015-0221-5
Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. doi:10.1214/17-AOS1621
spm_confint
to estimate confidence intervals
for \(\theta\).
spm_methods
for S3 methods for "spm"
objects.
### Newlyn sea surges
theta <- spm(newlyn, 20)
theta
#>
#> Call:
#> spm(data = newlyn, b = 20)
#>
#> Estimates of the extremal index theta:
#> N2015 BB2018 BB2018b
#> sliding 0.2392 0.3078 0.2578
#> disjoint 0.2350 0.3042 0.2542
summary(theta)
#>
#> Call:
#> spm(data = newlyn, b = 20)
#>
#> Estimate Std. Error Bias adj.
#> N2015, sliding 0.2392 0.01990 0.003317
#> BB2018, sliding 0.3078 0.01642 0.003026
#> BB2018b, sliding 0.2578 0.01642 0.053030
#> N2015, disjoint 0.2350 0.02222 0.003726
#> BB2018, disjoint 0.3042 0.02101 0.003571
#> BB2018b, disjoint 0.2542 0.02101 0.053570
coef(theta)
#> N2015 BB2018 BB2018b
#> 0.2392234 0.3078219 0.2578219
nobs(theta)
#> [1] 2875
vcov(theta)
#> N2015 BB2018 BB2018b
#> 0.0003960306 0.0002694754 0.0002694754
### S&P 500 index
theta <- spm(sp500, 100)
theta
#>
#> Call:
#> spm(data = sp500, b = 100)
#>
#> Estimates of the extremal index theta:
#> N2015 BB2018 BB2018b
#> sliding 0.3264 0.3359 0.3259
#> disjoint 0.3248 0.3343 0.3243
summary(theta)
#>
#> Call:
#> spm(data = sp500, b = 100)
#>
#> Estimate Std. Error Bias adj.
#> N2015, sliding 0.3264 0.03348 0.007996
#> BB2018, sliding 0.3359 0.03343 0.008026
#> BB2018b, sliding 0.3259 0.03343 0.018030
#> N2015, disjoint 0.3248 0.03836 0.009045
#> BB2018, disjoint 0.3343 0.03862 0.009114
#> BB2018b, disjoint 0.3243 0.03862 0.019110