Iterated weighted least squares estimation of the extremal index

Estimates the extremal index \(\theta\) using the iterated weighted least squares method of Suveges (2007). At the moment no estimates of uncertainty are provided.

Usage

iwls(data, u, maxit = 100)

Arguments

data: A numeric vector of raw data. No missing values are allowed.
u: A numeric scalar. Extreme value threshold applied to data.
maxit: A numeric scalar. The maximum number of iterations.

Value

An object (a list) of class c("iwls", "exdex") containing

theta: The estimate of \(\theta\).
conv: A convergence indicator: 0 indicates successful convergence; 1 indicates that maxit has been reached.
niter: The number of iterations performed.
n_gaps: The number of time gaps between successive exceedances.
call: The call to iwls.

Details

The iterated weighted least squares algorithm on page 46 of Suveges (2007) is used to estimate the extremal index. This approach uses the time gaps between successive exceedances of the threshold u in data. The \(i\)th gap is defined as \(T_i - 1\), where \(T_i\) is the difference between the occurrence times of exceedance \(i\) and exceedance \(i + 1\). Therefore, threshold exceedances at adjacent time points produce a gap of zero.
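As a minimal sketch of the gap definition above (not the package's internal code; the vector `x` and the threshold `u_demo` are made up for illustration), the gaps can be computed in base R from the positions of the exceedances:

```r
# Hypothetical toy series and threshold, for illustration only
x <- c(1, 5, 6, 2, 1, 7, 1, 1, 1, 8)
u_demo <- 4

# Time points at which the threshold is exceeded
exc_times <- which(x > u_demo)   # positions 2, 3, 6, 10

# T_i: differences between successive exceedance times
T_i <- diff(exc_times)           # 1, 3, 4

# The i-th gap is T_i - 1, so adjacent exceedances give a gap of zero
gaps <- T_i - 1
gaps                             # 0, 2, 3
```

The exceedances at times 2 and 3 are adjacent, which is why the first gap is zero.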

The model underlying this approach is an exponential-point mass mixture for scaled gaps, that is, gaps multiplied by the proportion of values in data that exceed u. Under this model scaled gaps are zero ('within-cluster' inter-exceedance times) with probability \(1 - \theta\) and otherwise ('between-cluster' inter-exceedance times) follow an exponential distribution with mean \(1 / \theta\). The estimation method is based on fitting the 'broken stick' model of Ferro (2003) to an exponential quantile-quantile plot of all of the scaled gaps. Specifically, the broken stick is a horizontal line and a line with gradient \(1 / \theta\) which intersect at \((-\log\theta, 0)\). The algorithm on page 46 of Suveges (2007) uses a weighted least squares minimization applied to the exponential part of this model to seek a compromise between the role of \(\theta\) as the proportion of inter-exceedance times that are between-cluster and the reciprocal of the mean of an exponential distribution for these inter-exceedance times. The weights (see Ferro (2003)) are based on the variances of order statistics of a standard exponential sample: larger order statistics have larger sampling variabilities and therefore receive smaller weight than smaller order statistics.
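The two roles of \(\theta\) in this mixture, and the behaviour of the order-statistic variances, can be illustrated with a short base R sketch. It is not the algorithm of Suveges (2007): `theta_true`, `n_sim` and `n_os` are arbitrary illustrative values, and the inverse-variance form of the weights is an assumption made here only to show that the weights decrease with the order of the statistic.

```r
set.seed(1)          # for a reproducible simulation
theta_true <- 0.5    # hypothetical extremal index, for illustration only
n_sim <- 10000       # number of simulated scaled gaps

# Mixture: zero with probability 1 - theta, otherwise exponential with mean 1 / theta
between <- runif(n_sim) < theta_true
scaled_gaps <- ifelse(between, rexp(n_sim, rate = theta_true), 0)

# Role 1: theta as the proportion of between-cluster (non-zero) gaps
prop_between <- mean(scaled_gaps > 0)
# Role 2: theta as the reciprocal of the mean of the non-zero gaps
recip_mean <- 1 / mean(scaled_gaps[scaled_gaps > 0])
c(prop_between, recip_mean)   # both should be close to theta_true

# Order statistics of n_os standard exponential variables: the i-th one is a
# sum of independent exponentials with rates n_os, n_os - 1, ..., n_os - i + 1
n_os <- 10
rates <- n_os:1
os_mean <- cumsum(1 / rates)    # means of the order statistics
os_var <- cumsum(1 / rates^2)   # variances, which increase with i

# Assumed weighting for this sketch: inverse variance, so weights decrease with i
w <- 1 / os_var
cbind(i = 1:n_os, mean = os_mean, var = os_var, weight = w)
```

In simulated data the two quantities agree closely by construction; in real data they can differ, which is the tension that the weighted least squares fit is described as compromising between.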

Note that in step (1) of the algorithm on page 46 of Suveges (2007) there is a typo: \(N_c + 1\) should be \(N\), where \(N\) is the number of threshold exceedances. Also, the gaps are scaled as detailed above, not by their mean.
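To make the scaling concrete, the following base R sketch uses a made-up series `x` and threshold `u_demo` (both hypothetical, and not the package's internal code) to compute \(N\) and the scaled gaps:

```r
# Hypothetical toy series and threshold, for illustration only
x <- c(1, 5, 6, 2, 1, 7, 1, 1, 1, 8)
u_demo <- 4

# N: the number of threshold exceedances
N <- sum(x > u_demo)                       # 4

# Proportion of values that exceed the threshold
q_u <- mean(x > u_demo)                    # 4 / 10 = 0.4

# Raw gaps T_i - 1 between successive exceedances (there are N - 1 of them)
raw_gaps <- diff(which(x > u_demo)) - 1    # 0, 2, 3

# Scaled gaps: multiplied by the exceedance proportion, not divided by their mean
scaled_gaps <- q_u * raw_gaps
scaled_gaps                                # 0.0, 0.8, 1.2
```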

References

Suveges, M. (2007) Likelihood estimation of the extremal index. Extremes, 10, 41-55. doi:10.1007/s10687-007-0034-2

Ferro, C.A.T. (2003) Statistical methods for clusters of extreme values. Ph.D. thesis, Lancaster University.

Examples

### S&P 500 index

u <- quantile(sp500, probs = 0.60)
theta <- iwls(sp500, u)
theta
#> 
#> Call:
#> iwls(data = sp500, u = u)
#> 
#> Convergence (0 means success): 0 
#> 
#> Estimate of the extremal index theta:
#> [1]  0.7707
coef(theta)
#> [1] 0.770706
nobs(theta)
#> [1] 2899

### Newlyn sea surges

u <- quantile(newlyn, probs = 0.90)
theta <- iwls(newlyn, u)
theta
#> 
#> Call:
#> iwls(data = newlyn, u = u)
#> 
#> Convergence (0 means success): 0 
#> 
#> Estimate of the extremal index theta:
#> [1]  0.2514