Initial estimates for Generalized Pareto parameters

Calculates initial estimates and estimated standard errors (SEs) for the generalized Pareto parameters \(\sigma\) and \(\xi\) based on an assumed random sample from this distribution. Also calculates initial estimates and estimated standard errors for \(\phi_1 = \sigma\) and \(\phi_2 = \xi + \sigma / x_{(m)}\), where \(x_{(m)}\) is the sample maximum threshold exceedance.

gpd_init(gpd_data, m, xm, sum_gp = NULL, xi_eq_zero = FALSE, init_ests = NULL)

Arguments

gpd_data: A numeric vector containing positive sample values.
m: A numeric scalar. The sample size, i.e., the length of gpd_data.
xm: A numeric scalar. The sample maximum.
sum_gp: A numeric scalar. The sum of the sample values.
xi_eq_zero: A logical scalar. If TRUE, assume that the shape parameter \(\xi = 0\).
init_ests: A numeric vector. Initial estimate of \(\theta = (\sigma, \xi)\). If supplied, gpd_init() returns the corresponding initial estimate of \(\phi = (\phi_1, \phi_2)\).

Value

If init_ests is not supplied by the user, a list is returned with components

init: A numeric vector. Initial estimates of \(\sigma\) and \(\xi\).
se: A numeric vector. Estimated standard errors of \(\sigma\) and \(\xi\).
init_phi: A numeric vector. Initial estimates of \(\phi_1 = \sigma\) and \(\phi_2 = \xi + \sigma / x_{(m)}\), where \(x_{(m)}\) is the maximum of gpd_data.
se_phi: A numeric vector. Estimated standard errors of \(\phi_1\) and \(\phi_2\).

If init_ests is supplied then only the numeric vector init_phi is returned.

Details

The main aim is to calculate an admissible estimate of \(\theta\), i.e., one at which the log-likelihood is finite (which is necessary for the posterior log-density to be finite), together with associated estimated SEs. These are converted into estimates and SEs for \(\phi\). The latter can be used to set values of min_phi and max_phi for input to find_lambda.
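Admissibility can be checked directly from the support constraint of the generalized Pareto density: the log-likelihood is finite only if \(\sigma > 0\) and \(1 + \xi x / \sigma > 0\) for every observation, and for positive data the binding case is the sample maximum. The following is a minimal sketch in base R; gp_admissible() is a hypothetical helper introduced here for illustration and is not a function documented on this page.

```r
# Hypothetical helper (an assumption, not part of the documented API):
# TRUE if the GP log-likelihood is finite at (sigma, xi) for positive
# data whose maximum is xm.
gp_admissible <- function(sigma, xi, xm) {
  # 1 + xi * x / sigma > 0 must hold for all x in (0, xm]; given
  # sigma > 0 this reduces to xi + sigma / xm > 0.
  sigma > 0 && (xi + sigma / xm) > 0
}

gp_admissible(sigma = 1, xi = -0.1, xm = 5)  # TRUE:  -0.1 + 0.2 > 0
gp_admissible(sigma = 1, xi = -0.3, xm = 5)  # FALSE: -0.3 + 0.2 < 0
```

For \(\xi \geq 0\) the check always passes when \(\sigma > 0\); it only bites for negative shape values.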

In the default setting (xi_eq_zero = FALSE and init_ests = NULL) the methods tried are Maximum Likelihood Estimation (MLE) (Grimshaw, 1993), Probability-Weighted Moments (PWM) (Hosking and Wallis, 1987) and Linear Combinations of Ratios of Spacings (LRS) (Reiss and Thomas, 2007, page 134) in that order.

For \(\xi < -1\) the likelihood is unbounded; MLE may fail when \(\xi\) is not greater than \(-0.5\); and the observed Fisher information for \((\sigma, \xi)\) has finite variance only if \(\xi > -0.25\). We use the ML estimate provided that the estimate of \(\xi\) returned from gpd_mle is greater than \(-1\). We use the SE only if the MLE of \(\xi\) is greater than \(-0.25\).
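The two thresholds in this paragraph amount to a simple rule on the ML estimate of \(\xi\). A minimal sketch of that rule, where mle_usable() is a hypothetical helper used only for illustration (it is not how the package is known to organise this check):

```r
# Hypothetical helper (an assumption): which parts of an ML fit would be
# kept under the thresholds stated above, given the ML estimate of xi.
mle_usable <- function(xi_hat) {
  c(estimate = xi_hat > -1,    # keep the ML estimate only if xi_hat > -1
    se       = xi_hat > -0.25) # keep its SE only if xi_hat > -0.25
}

mle_usable(-0.1)  # estimate TRUE,  se TRUE
mle_usable(-0.4)  # estimate TRUE,  se FALSE
mle_usable(-1.2)  # estimate FALSE, se FALSE
```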

If either the MLE or its SE is not OK then we try PWM. We use the PWM estimate only if it is admissible and the MLE was not OK. We use the PWM SE, but this will be c(NA, NA) if the PWM estimate of \(\xi\) is \(> 1/2\). If the estimate is still not OK then we try LRS. As a last resort, which will tend to occur only when \(\xi\) is strongly negative, we set \(\xi = -1\) and estimate \(\sigma\) conditional on this.
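To illustrate the PWM step, the sketch below computes probability-weighted moment estimates in base R. It relies on the moment relations \(\alpha_0 = E[X] = \sigma / (1 - \xi)\) and \(\alpha_1 = E[X(1 - F(X))] = \sigma / (2(2 - \xi))\), which give \(\xi = 2 - \alpha_0 / (\alpha_0 - 2\alpha_1)\) and \(\sigma = 2 \alpha_0 \alpha_1 / (\alpha_0 - 2 \alpha_1)\). The function name pwm_gp() is hypothetical, and the unbiased sample estimator of \(\alpha_1\) used here is an assumption that may differ from what the package does internally.

```r
# Illustrative PWM estimator for GP(sigma, xi). pwm_gp() is a hypothetical
# name (an assumption), not a function documented on this page.
pwm_gp <- function(x) {
  n <- length(x)
  x <- sort(x)
  a0 <- mean(x)                                   # estimates E[X]
  a1 <- sum((n - seq_len(n)) / (n - 1) * x) / n   # unbiased for E[X (1 - F(X))]
  xi <- 2 - a0 / (a0 - 2 * a1)
  sigma <- 2 * a0 * a1 / (a0 - 2 * a1)
  c(sigma = sigma, xi = xi)
}

# Evenly spread data look uniform, i.e. GP with xi = -1:
pwm_gp(c(1, 2, 3))   # sigma = 4, xi = -1 (up to rounding)

# Exponential data correspond to sigma = 1, xi = 0:
set.seed(1)
pwm_gp(rexp(1000))
```

A PWM estimate obtained this way is not guaranteed to be admissible, which is why the text above uses it only after checking admissibility.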

References

Grimshaw, S. D. (1993) Computing Maximum Likelihood Estimates for the Generalized Pareto Distribution. Technometrics, 35(2), 185-191.

Hosking, J. R. M. and Wallis, J. R. (1987) Parameter and Quantile Estimation for the Generalized Pareto Distribution. Technometrics, 29(3), 339-349. doi:10.2307/1269343.

Reiss, R.-D. and Thomas, M. (2007) Statistical Analysis of Extreme Values with Applications to Insurance, Finance, Hydrology and Other Fields. Birkhäuser. doi:10.1007/978-3-7643-7399-3.

Examples

# \donttest{
# Sample data from a GP(sigma, xi) distribution
gpd_data <- rgpd(m = 100, xi = 0, sigma = 1)
# Calculate summary statistics for use in the log-likelihood
ss <- gpd_sum_stats(gpd_data)
# Calculate initial estimates
do.call(gpd_init, ss)
#> $init
#> [1]  1.04553019 -0.03455597
#> 
#> $se
#>   sigma[u]         xi 
#> 0.14393678 0.09468163 
#> 
#> $init_phi
#> [1] 1.0455302 0.1476632
#> 
#> $se_phi
#> [1] 0.14393678 0.07877361
#> 
# }