Maximum likelihood estimation for the \(K\)-gaps model

Calculates maximum likelihood estimates of the extremal index \(\theta\) based on the \(K\)-gaps model for threshold inter-exceedances times of Suveges and Davison (2010).

kgaps(data, u, k = 1, inc_cens = TRUE)

Arguments

data

A numeric vector or numeric matrix of raw data. If data is a matrix then the log-likelihood is constructed as the sum of (independent) contributions from different columns. A common situation is where each column relates to a different year.

If data contains missing values then split_by_NAs is used to divide the data further into sequences of non-missing values, stored in different columns in a matrix. Again, the log-likelihood is constructed as a sum of contributions from different columns.

u

A numeric scalar. Extreme value threshold applied to data.

k

A non-negative numeric scalar. Run parameter \(K\), as defined in Suveges and Davison (2010). Threshold inter-exceedances times that are not larger than k units are assigned to the same cluster, resulting in a \(K\)-gap equal to zero. Specifically, the \(K\)-gap \(S\) corresponding to an inter-exceedance time of \(T\) is given by \(S = \max(T - K, 0)\). In practice, \(k\) should be no smaller than 1, because when \(k\) is less than 1 the estimate of \(\theta\) is always equal to 1.

inc_cens

A logical scalar indicating whether or not to include contributions from right-censored inter-exceedance times, relating to the first and last observations. It is known that these times are greater than or equal to the time observed. See Attalides (2015) for details. If data has multiple columns then there will be right-censored first and last inter-exceedance times for each column.

Value

An object (a list) of class c("kgaps", "exdex") containing

theta: The maximum likelihood estimate (MLE) of \(\theta\).
se: The estimated standard error of the MLE, calculated using an algebraic expression for the observed information. If k = 0 then se is returned as 0.
se_exp: The estimated standard error of the MLE, calculated using an algebraic expression for the expected information. If the estimate of \(\theta\) is 0 or 1 then se_exp is NA.
ss: The list of summary statistics returned from kgaps_stat.
k, u, inc_cens: The input values of k, u and inc_cens.
max_loglik: The value of the log-likelihood at the MLE.
call: The call to kgaps.

Details

If inc_cens = FALSE then the maximum likelihood estimate of the extremal index \(\theta\) under the \(K\)-gaps model of Suveges and Davison (2010) is calculated.

If inc_cens = TRUE then information from right-censored first and last inter-exceedance times is also included in the likelihood to be maximized, following Attalides (2015). The form of the log-likelihood is given in the Details section of kgaps_stat.

It is possible that the estimate of \(\theta\) is equal to 1, and also possible that it is equal to 0. kgaps_stat explains the respective properties of the data that cause these events to occur.

References

Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292

Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf

Examples

### S&P 500 index

u <- quantile(sp500, probs = 0.60)
theta <- kgaps(sp500, u)
theta
#> 
#> Call:
#> kgaps(data = sp500, u = u)
#> 
#> Estimate of the extremal index theta:
#>  theta  
#> 0.6953  
summary(theta)
#> 
#> Call:
#> kgaps(data = sp500, u = u)
#> 
#>       Estimate Std. Error
#> theta   0.6953   0.007234
coef(theta)
#>     theta 
#> 0.6953391 
nobs(theta)
#> [1] 2901
vcov(theta)
#>              theta
#> theta 5.232578e-05
logLik(theta)
#> 'log Lik.' -3811.894 (df=1)

### Newlyn sea surges

u <- quantile(newlyn, probs = 0.60)
theta <- kgaps(newlyn, u, k = 2)
theta
#> 
#> Call:
#> kgaps(data = newlyn, u = u, k = 2)
#> 
#> Estimate of the extremal index theta:
#>  theta  
#> 0.1758  
summary(theta)
#> 
#> Call:
#> kgaps(data = newlyn, u = u, k = 2)
#> 
#>       Estimate Std. Error
#> theta   0.1758   0.009211

### Cheeseboro wind gusts

theta <- kgaps(cheeseboro, 45, k = 3)
theta
#> 
#> Call:
#> kgaps(data = cheeseboro, u = 45, k = 3)
#> 
#> Estimate of the extremal index theta:
#>  theta  
#> 0.2405  
summary(theta)
#> 
#> Call:
#> kgaps(data = cheeseboro, u = 45, k = 3)
#> 
#>       Estimate Std. Error
#> theta   0.2405    0.02336