Predictive inference for the largest value observed in N years.
Source:R/predictive.R
predict.ithresh.Rd
predict
method for class "ithresh"
. Predictive inferences can
either be based on a single training threshold or using a weighted
average of inferences over multiple training thresholds. A single
threshold may chosen to be the best performing threshold, as judged by the
measure of predictive performance calculated by ithresh
or
chosen by the user. The weights used in the latter case are based on the
measures of predictive performance and prior probabilities assigned to the
training thresholds. By default all thresholds are given the same
prior probability but the user can specify their own prior.
Arguments
- object
An object of class
"ithresh"
, a result of a call toithresh
.- npy
A numeric scalar. The mean number of observations per year of data, after excluding any missing values, i.e. the number of non-missing observations divided by total number of years of non-missing data.
- n_years
A numeric vector. Value(s) of N. If
which_u = "all"
thenn_years
must have length one.- which_u
Either a character scalar or a numeric scalar. If
which_u
is a character scalar it must be either "best" or "all".If
which_u = "best"
then the single threshold achieving the largest measure of predictive performance inobject$pred_perf
, based on the validation threshold selected usingwhich_v
, is used to perform prediction. Seesummary.ithresh
to print the best thresholds for each validation threshold.If
which_u = "all"
then all the thresholds are used to perform prediction. The inferences from each threshold are weighted according to the posterior threshold weights given in equation (15) of Northrop et al. (2017) based on the prior probabilities of thresholds inu_prior
and columnwhich_v
of the measures of predictive performance inobject$pred_perf
.Otherwise,
which_u
is a numeric scalar that indicates which element ofobject$u_vec
the user wishes to select as a single threshold on which to base prediction, that is,which_u
must be an integer in1, ..., length(object$u_vec)
.- which_v
A numeric scalar. Indicates which element of
object$v_vec
is used in selecting a single threshold (ifwhich_u = "best"
) or weighting the inferences from all thresholds (ifwhich_u = "all"
). Note: the default,which_v = 1
gives the lowest of the validation thresholds inobject$v_vec
.- u_prior
A numeric vector. Prior probabilities for the training thresholds in
object$u_vec
. Only used ifwhich_u = "all"
.Only the first
length(object$u_vec) - length(object$v_vec) + which_v
elements ofu_prior
are used. This is because only training thresholds up to and includingobject$v_vec[which_v]
are relevant.u_prior
must have lengthlength(object$u_vec)
orlength(object$u_vec) - length(object$v_vec) + which_v
.If
u_prior
is not supplied then all (relevant) training thresholds are given equal prior probability.u_prior
is normalized to have sum equal to 1 insidepredict.ithresh
.- type
A character vector. Passed to
predict.evpost
. Indicates which type of inference is required:"p" for the predictive distribution function,
"d" for the predictive density function,
"q" for the predictive quantile function,
"i" for predictive intervals (see
...
to setlevel
),"r" for random generation from the predictive distribution.
If
which_u = "all"
then onlytype = "p"
ortype = "d"
are allowed.- hpd
A logical scalar. The argument
hpd
ofpredict.evpost
. Only relevant iftype = "i"
.- x
A numeric vector. The argument
x
ofpredict.evpost
. In the current context this must be a vector (not a matrix, as suggested by the documentation ofpredict.evpost
). Ifx
is not supplied then a default value is set withinpredict.evpost
.- ...
Additional arguments to be passed to
predict.evpost
. In particular:level
, which can be used to set (multiple) levels for predictive intervals whentype = "i"
;lower_tail
(relevant whentype = "p"
or"q"
) andlog
(relevant whentype = "d"
).
Value
An list object of class "ithreshpred"
with a similar
structure to an object of class "evpred"
returned from
predict.evpost
is returned invisibly.
In addition, the object contains
u_vec = object$u_vec
and v_vec = object$v_vec
,
which_v
and the index best_u
in
u_vec = object$u_vec
of the best training threshold based on
the value of which_v
.
It also contains the value of the Box-Cox transformation parameter
lambda
. This will always be equal to 1 if object
was
returned from ithresh
.
If which_u = "all"
then
the list also contains the posterior threshold weights in component
post_thresh_wts
the component
y
is a matrix withlength{x}
rows and 1 +length(object$u_vec) - length(object$v_vec) + which_v
columns. Column 1 contains the estimated predictive distribution function (type = "p"
) or density function (type = "d"
) obtained using a weighted average of the inferences over different training thresholds. The other columns contain the estimated functions for each of the training thresholds inu_vec
.
Details
The function predict.evpost
is used to
perform predictive based on the binomial-GP posterior sample generated
using a given training threshold. For mathematical details of the
single threshold and multiple threshold cases see Sections 2.3 and 3 of
Northrop et al. (2017) respectively.
References
Northrop, P. J., Attalides, N. and Jonathan, P. (2017) Cross-validatory extreme value threshold selection and uncertainty with application to ocean storm severity. Journal of the Royal Statistical Society Series C: Applied Statistics, 66(1), 93-120. doi:10.1111/rssc.12159
See also
ithresh
for threshold selection in the i.i.d. case
based on leave-one-out cross-validation.
plot.ithreshpred
for the S3 plot method for objects
of class ithreshpred
.
Examples
# Note:
#' In the examples below validation thresholds rather higher than is
# advisable have been used, with far fewer excesses than the minimum of
# 50 suggested by Jonathan and Ewans (2013).
# Gulf of Mexico significant wave heights, default priors.
u_vec_gom <- quantile(gom, probs = seq(0, 0.9, by = 0.05))
gom_cv <- ithresh(data = gom, u_vec = u_vec_gom, n_v = 3)
# Note: gom_cv$npy contains the correct value of npy (it was set in the
# call to ithresh, via attr(gom, "npy").
# If object$npy doesn't exist then the argument npy must be supplied
# in the call to predict().
### Best training threshold based on the lowest validation threshold
# Predictive distribution function
best_p <- predict(gom_cv, n_years = c(100, 1000))
plot(best_p)
# Predictive density function
best_d <- predict(gom_cv, type = "d", n_years = c(100, 1000))
plot(best_d)
# Predictive intervals
best_i <- predict(gom_cv, n_years = c(100, 1000), type = "i", hpd = TRUE,
level = c(95, 99))
plot(best_i, which_int = "both")
#> Warning: argument 1 does not name a graphical parameter
# See which threshold was used
summary(gom_cv)
#> v v quantile best u best u quantile index of u_vec
#> 1 4.6070 80 3.1598 55 12
#> 2 5.1302 85 3.6545 65 14
#> 3 5.8246 90 3.6545 65 14
### All thresholds plus weighted average of inferences over all thresholds
# Predictive distribution function
all_p <- predict(gom_cv, which_u = "all")
plot(all_p)
# Predictive density function
all_d <- predict(gom_cv, which_u = "all", type = "d")
plot(all_d)