Likelihood-Based Inference for Time Series Extremes

The lite package performs likelihood-based inference for stationary time series extremes. The general approach follows Fawcett and Walshaw (2012). There are 3 independent parts to the inference.

A Bernoulli (p_u) model for whether a given observation exceeds the threshold u.
A generalised Pareto, GP (σ_u,ξ), model for the marginal distribution of threshold excesses.
The K-gaps model for the extremal index θ, based on inter-exceedance times.

For parts 1 and 2 it is necessary to adjust the inferences because we expect that the data will exhibit cluster dependence. This is achieved using the methodology developed in Chandler and Bate (2007) to produce a log-likelihood that is adjusted for this dependence. This is achieved using the chandwich package. For part 3, the methodology described in Süveges and Davison (2010) is used, implemented by the function kgaps in the exdex package. The (adjusted) log-likelihoods from parts 1, 2 and 3 are combined to make inferences about return levels.

We illustrate the main functions in lite using the cheeseboro wind gusts data from the exdex package, which contains hourly wind gust data from each January over the 10-year period 2000-2009.

Frequentist inference

The function flite makes frequentist inferences about (p_u,σ_u,ξ,θ) using maximum likelihood estimation. First, we make inferences about the model parameters.

library(lite)
cdata <- exdex::cheeseboro
# Each column of the matrix cdata corresponds to data from a different year
# flite() sets cluster automatically to correspond to column (year)
cfit <- flite(cdata, u = 45, k = 3)
summary(cfit)
#> 
#> Call:
#> flite(data = cdata, u = 45, k = 3)
#> 
#>          Estimate Std. Error
#> p[u]      0.02771   0.005988
#> sigma[u]  9.27400   2.071000
#> xi       -0.09368   0.084250
#> theta     0.24050   0.023360

Then, we make inferences about the 100-year return level, including 95% confidence intervals. The argument ny sets the number of observations per year, which is 31 × 24 = 744 for these data.

rl <- returnLevel(cfit, m = 100, level = 0.95, ny = 31 * 24)
rl
#> 
#> Call:
#> returnLevel(x = cfit, m = 100, level = 0.95, ny = 31 * 24)
#> 
#> MLE and 95% confidence limits for the 100-year return level
#> 
#> Normal interval:
#>  lower     mle   upper  
#>  69.80   88.62  107.44  
#> Profile likelihood-based interval:
#>  lower     mle   upper  
#>  75.89   88.62  125.43

Bayesian inference

The function blite performs Bayesian inferences about (p_u,σ_u,ξ,θ), based on a likelihood constructed from the (adjusted) log-likelihoods detailed above. First, we sample from the posterior distribution of the parameters, using the default priors.

cpost <- blite(cdata, u = 45, k = 3, ny = 31 * 24)
summary(cpost)
#> 
#> Call:
#> blite(data = cdata, u = 45, k = 3, ny = 31 * 24)
#> 
#>          Posterior mean Posterior SD
#> p[u]            0.02832     0.006008
#> sigma[u]       10.03000     2.435000
#> xi             -0.07196     0.094030
#> theta           0.24250     0.024040

Then, we estimate a 95% highest predictive density (HPD) interval for the largest value M₁₀₀ to be observed over a future time period of length 100 years.

predict(cpost, hpd = TRUE, n_years = 100)$short
#>         lower    upper n_years level
#> [1,] 73.09008 139.8616     100    95

Objects returned from flite and blite have plot methods to summarise graphically, respectively, log-likelihoods and posterior distributions.

Installation

To get the current released version from CRAN:

install.packages("lite")

Vignettes

See vignette("lite-1-frequentist", package = "lite") and vignette("lite-2-bayesian", package = "lite").

lite