R/predictive.R
predict.ithresh.Rd
predict
method for class "ithresh"
. Predictive inferences can
either be based on a single training threshold or using a weighted
average of inferences over multiple training thresholds. A single
threshold may chosen to be the best performing threshold, as judged by the
measure of predictive performance calculated by ithresh
or
chosen by the user. The weights used in the latter case are based on the
measures of predictive performance and prior probabilities assigned to the
training thresholds. By default all thresholds are given the same
prior probability but the user can specify their own prior.
An object of class "ithresh"
, a result of a call to
ithresh
.
A numeric scalar. The mean number of observations per year of data, after excluding any missing values, i.e. the number of non-missing observations divided by total number of years of non-missing data.
A numeric vector. Value(s) of N. If which_u = "all"
then n_years
must have length one.
Either a character scalar or a numeric scalar.
If which_u
is a character scalar it must be either "best" or "all".
If which_u = "best"
then the single threshold achieving the largest
measure of predictive performance in object$pred_perf
, based
on the validation threshold selected using which_v
, is used to
perform prediction. See summary.ithresh
to print the
best thresholds for each validation threshold.
If which_u = "all"
then all the thresholds are used to
perform prediction. The inferences from each threshold are weighted
according to the posterior threshold weights given in
equation (15) of Northrop et al. (2017) based on the prior probabilities
of thresholds in u_prior
and column which_v
of the measures
of predictive performance in object$pred_perf
.
Otherwise, which_u
is a numeric scalar that indicates which
element of object$u_vec
the user wishes to select as a single
threshold on which to base prediction, that is, which_u
must
be an integer in 1, ..., length(object$u_vec)
.
A numeric scalar. Indicates which element of
object$v_vec
is used in selecting a single threshold
(if which_u = "best"
) or weighting the inferences from
all thresholds (if which_u = "all"
).
Note: the default, which_v = 1
gives the lowest of the
validation thresholds in object$v_vec
.
A numeric vector. Prior probabilities for the training
thresholds in object$u_vec
. Only used if which_u = "all"
.
Only the first
length(object$u_vec) - length(object$v_vec) + which_v
elements of u_prior
are used. This is because only training
thresholds up to and including object$v_vec[which_v]
are relevant.
u_prior
must have length length(object$u_vec)
or
length(object$u_vec) - length(object$v_vec) + which_v
.
If u_prior
is not supplied then all (relevant) training thresholds
are given equal prior probability.
u_prior
is normalized to have sum equal to 1 inside
predict.ithresh
.
A character vector.
Passed to predict.evpost
.
Indicates which type of inference is required:
"p" for the predictive distribution function,
"d" for the predictive density function,
"q" for the predictive quantile function,
"i" for predictive intervals (see ...
to set level
),
"r" for random generation from the predictive distribution.
If which_u = "all"
then only type = "p"
or type = "d"
are allowed.
A logical scalar. The argument hpd
of
predict.evpost
. Only relevant if
type = "i"
.
A numeric vector. The argument x
of
predict.evpost
. In the current context this
must be a vector (not a matrix, as suggested by the documentation of
predict.evpost
). If x
is not supplied
then a default value is set within
predict.evpost
.
Additional arguments to be passed to
predict.evpost
. In particular:
level
, which can be used to set (multiple) levels
for predictive intervals when type = "i"
;
lower_tail
(relevant when type = "p"
or "q"
) and
log
(relevant when type = "d"
).
An list object of class "ithreshpred"
with a similar
structure to an object of class "evpred"
returned from
predict.evpost
is returned invisibly.
In addition, the object contains
u_vec = object$u_vec
and v_vec = object$v_vec
,
which_v
and the index best_u
in
u_vec = object$u_vec
of the best training threshold based on
the value of which_v
.
It also contains the value of the Box-Cox transformation parameter
lambda
. This will always be equal to 1 if object
was
returned from ithresh
.
If which_u = "all"
then
the list also contains the posterior threshold weights
in component post_thresh_wts
the component y
is a matrix with length{x}
rows
and 1 + length(object$u_vec) - length(object$v_vec) + which_v
columns. Column 1 contains the estimated predictive distribution
function (type = "p"
) or density function (type = "d"
)
obtained using a weighted average of the inferences over different
training thresholds. The other columns contain the estimated
functions for each of the training thresholds in u_vec
.
The function predict.evpost
is used to
perform predictive based on the binomial-GP posterior sample generated
using a given training threshold. For mathematical details of the
single threshold and multiple threshold cases see Sections 2.3 and 3 of
Northrop et al. (2017) respectively.
Northrop, P. J., Attalides, N. and Jonathan, P. (2017) Cross-validatory extreme value threshold selection and uncertainty with application to ocean storm severity. Journal of the Royal Statistical Society Series C: Applied Statistics, 66(1), 93-120. doi:10.1111/rssc.12159
ithresh
for threshold selection in the i.i.d. case
based on leave-one-out cross-validation.
plot.ithreshpred
for the S3 plot method for objects
of class ithreshpred
.
# Note:
#' In the examples below validation thresholds rather higher than is
# advisable have been used, with far fewer excesses than the minimum of
# 50 suggested by Jonathan and Ewans (2013).
# Gulf of Mexico significant wave heights, default priors.
u_vec_gom <- quantile(gom, probs = seq(0, 0.9, by = 0.05))
gom_cv <- ithresh(data = gom, u_vec = u_vec_gom, n_v = 3)
# Note: gom_cv$npy contains the correct value of npy (it was set in the
# call to ithresh, via attr(gom, "npy").
# If object$npy doesn't exist then the argument npy must be supplied
# in the call to predict().
### Best training threshold based on the lowest validation threshold
# Predictive distribution function
best_p <- predict(gom_cv, n_years = c(100, 1000))
plot(best_p)
# Predictive density function
best_d <- predict(gom_cv, type = "d", n_years = c(100, 1000))
plot(best_d)
# Predictive intervals
best_i <- predict(gom_cv, n_years = c(100, 1000), type = "i", hpd = TRUE,
level = c(95, 99))
plot(best_i, which_int = "both")
#> Warning: argument 1 does not name a graphical parameter
# See which threshold was used
summary(gom_cv)
#> v v quantile best u best u quantile index of u_vec
#> 1 4.6070 80 3.3878 60 13
#> 2 5.1302 85 3.9754 70 15
#> 3 5.8246 90 3.9754 70 15
### All thresholds plus weighted average of inferences over all thresholds
# Predictive distribution function
all_p <- predict(gom_cv, which_u = "all")
plot(all_p)
# Predictive density function
all_d <- predict(gom_cv, which_u = "all", type = "d")
plot(all_d)