| Title: | Functions for Extreme Value Distributions |
|---|---|
| Description: | Extends simulation, distribution, quantile and density functions to univariate and multivariate parametric extreme value distributions, and provides fitting functions which calculate maximum likelihood estimates for univariate and bivariate maxima models, and for univariate and bivariate threshold models. |
| Authors: | Alec Stephenson. Function fbvpot by Chris Ferro. |
| Maintainer: | Alec Stephenson <[email protected]> |
| License: | GPL-3 |
| Version: | 2.3-7.1 |
| Built: | 2026-05-20 08:03:33 UTC |
| Source: | https://github.com/cran/evd |
Calculate or plot the dependence function for
nine parametric bivariate extreme value models.
abvevd(x = 0.5, dep, asy = c(1,1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), rev = FALSE, plot = FALSE, add = FALSE, lty = 1, lwd = 1, col = 1, blty = 3, blwd = 1, xlim = c(0,1), ylim = c(0.5,1), xlab = "t", ylab = "A(t)", ...)abvevd(x = 0.5, dep, asy = c(1,1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), rev = FALSE, plot = FALSE, add = FALSE, lty = 1, lwd = 1, col = 1, blty = 3, blwd = 1, xlim = c(0,1), ylim = c(0.5,1), xlab = "t", ylab = "A(t)", ...)
x |
A vector of values at which the dependence function is
evaluated (ignored if plot or add is |
dep |
Dependence parameter for the logistic, asymmetric logistic, Husler-Reiss, negative logistic and asymmetric negative logistic models. |
asy |
A vector of length two, containing the two asymmetry parameters for the asymmetric logistic and asymmetric negative logistic models. |
alpha, beta
|
Alpha and beta parameters for the bilogistic, negative bilogistic, Coles-Tawn and asymmetric mixed models. |
model |
The specified model; a character string. Must be
either |
rev |
Logical; reverse the dependence function? This is
equivalent to evaluating the function at |
plot |
Logical; if |
add |
Logical; add to an existing plot? The existing plot
should have been created using either |
lty, blty
|
Function and border line types. Set |
lwd, blwd
|
Function an border line widths. |
col |
Line colour. |
xlim, ylim
|
x and y-axis limits. |
xlab, ylab
|
x and y-axis labels. |
... |
Other high-level graphics parameters to be passed to
|
Any bivariate extreme value distribution can be written as
for some function defined on , where
for and
, with the (generalized extreme value) marginal
parameters given by ,
.
If then is defined by
continuity.
is called (by some authors) the dependence
function.
It follows that , and that is
a convex function with for all .
The lower and upper limits of are obtained under complete
dependence and independence respectively.
does not depend on the marginal parameters.
Some authors take B(x) = A(1-x) as the dependence function. If the
argument rev = TRUE, then B(x) is plotted/evaluated.
abvevd calculates or plots the dependence function
for one of nine parametric bivariate extreme value models,
at specified parameter values.
abvnonpar, fbvevd,
rbvevd, amvevd
abvevd(dep = 2.7, model = "hr") abvevd(seq(0,1,0.25), dep = 0.3, asy = c(.7,.9), model = "alog") abvevd(alpha = 0.3, beta = 1.2, model = "negbi", plot = TRUE) bvdata <- rbvevd(100, dep = 0.7, model = "log") M1 <- fitted(fbvevd(bvdata, model = "log")) abvevd(dep = M1["dep"], model = "log", plot = TRUE) abvnonpar(data = bvdata, add = TRUE, lty = 2)abvevd(dep = 2.7, model = "hr") abvevd(seq(0,1,0.25), dep = 0.3, asy = c(.7,.9), model = "alog") abvevd(alpha = 0.3, beta = 1.2, model = "negbi", plot = TRUE) bvdata <- rbvevd(100, dep = 0.7, model = "log") M1 <- fitted(fbvevd(bvdata, model = "log")) abvevd(dep = M1["dep"], model = "log", plot = TRUE) abvnonpar(data = bvdata, add = TRUE, lty = 2)
Calculate or plot non-parametric estimates for the dependence function
of the bivariate extreme value distribution.
abvnonpar(x = 0.5, data, epmar = FALSE, nsloc1 = NULL, nsloc2 = NULL, method = c("cfg", "pickands", "tdo", "pot"), k = nrow(data)/4, convex = FALSE, rev = FALSE, madj = 0, kmar = NULL, plot = FALSE, add = FALSE, lty = 1, lwd = 1, col = 1, blty = 3, blwd = 1, xlim = c(0, 1), ylim = c(0.5, 1), xlab = "t", ylab = "A(t)", ...)abvnonpar(x = 0.5, data, epmar = FALSE, nsloc1 = NULL, nsloc2 = NULL, method = c("cfg", "pickands", "tdo", "pot"), k = nrow(data)/4, convex = FALSE, rev = FALSE, madj = 0, kmar = NULL, plot = FALSE, add = FALSE, lty = 1, lwd = 1, col = 1, blty = 3, blwd = 1, xlim = c(0, 1), ylim = c(0.5, 1), xlab = "t", ylab = "A(t)", ...)
x |
A vector of values at which the dependence function is
evaluated (ignored if plot or add is |
data |
A matrix or data frame with two columns, which may contain missing values. |
epmar |
If |
nsloc1, nsloc2
|
A data frame with the same number of rows as
|
method |
The estimation method (see Details). Typically
either |
k |
An integer parameter for the |
convex |
Logical; take the convex minorant? |
rev |
Logical; reverse the dependence function? This is
equivalent to evaluating the function at |
madj |
Performs marginal adjustments for the |
kmar |
In the rare case that the marginal distributions are known, specifies the GEV parameters to be used instead of maximum likelihood estimates. |
plot |
Logical; if |
add |
Logical; add to an existing plot? The existing plot
should have been created using either |
lty, blty
|
Function and border line types. Set |
lwd, blwd
|
Function and border line widths. |
col |
Line colour. |
xlim, ylim
|
x and y-axis limits. |
xlab, ylab
|
x and y-axis labels. |
... |
Other high-level graphics parameters to be passed to
|
The dependence function of the bivariate
extreme value distribution is defined in abvevd.
Non-parametric estimates are constructed as follows.
Suppose for are
bivariate observations that are passed using the data
argument.
If epmar is FALSE (the default), then
the marginal parameters of the GEV margins are estimated
(under the assumption of independence) and the data is
transformed using
and
for , where
and
are the maximum likelihood estimates for the location, scale
and shape parameters on the first and second margins.
If nsloc1 or nsloc2 are given, the location
parameters may depend on (see fgev).
Two different estimators of the dependence function can be
implemented.
They are defined (on ) as
follows.
method = "cfg" (Caperaa, Fougeres and Genest, 1997)
method = "pickands" (Pickands, 1981)
Two variations on the estimator are
also implemented. If the argument madj = 1, an adjustment
given in Deheuvels (1991) is applied. If the argument
madj = 2, an adjustment given in Hall and Tajvidi (2000)
is applied. These are marginal adjustments; they are only
useful when empirical marginal estimation is used.
Let be any estimator of .
None of the estimators satisfy
for all . An obvious modification is
This modification is always implemented.
Convex estimators can be derived by taking the convex minorant,
which can be achieved by setting convex to TRUE.
abvnonpar calculates or plots a non-parametric estimate of
the dependence function of the bivariate extreme value distribution.
I have been asked to point out that Hall and Tajvidi (2000) suggest putting a constrained smoothing spline on their modified Pickands estimator, but this is not done here.
Caperaa, P. Fougeres, A.-L. and Genest, C. (1997) A non-parametric estimation procedure for bivariate extreme value copulas. Biometrika, 84, 567–577.
Pickands, J. (1981) Multivariate extreme value distributions. Proc. 43rd Sess. Int. Statist. Inst., 49, 859–878.
Deheuvels, P. (1991) On the limiting behaviour of the Pickands estimator for bivariate extreme-value distributions. Statist. Probab. Letters, 12, 429–439.
Hall, P. and Tajvidi, N. (2000) Distribution and dependence-function estimation for bivariate extreme-value distributions. Bernoulli, 6, 835–844.
abvevd, amvnonpar,
bvtcplot, fgev
bvdata <- rbvevd(100, dep = 0.7, model = "log") abvnonpar(seq(0, 1, length = 10), data = bvdata, convex = TRUE) abvnonpar(data = bvdata, method = "pick", plot = TRUE) M1 <- fitted(fbvevd(bvdata, model = "log")) abvevd(dep = M1["dep"], model = "log", plot = TRUE) abvnonpar(data = bvdata, add = TRUE, lty = 2)bvdata <- rbvevd(100, dep = 0.7, model = "log") abvnonpar(seq(0, 1, length = 10), data = bvdata, convex = TRUE) abvnonpar(data = bvdata, method = "pick", plot = TRUE) M1 <- fitted(fbvevd(bvdata, model = "log")) abvevd(dep = M1["dep"], model = "log", plot = TRUE) abvnonpar(data = bvdata, add = TRUE, lty = 2)
Calculate the dependence function for the multivariate
logistic and multivariate asymmetric logistic models; plot the
estimated function in the trivariate case.
amvevd(x = rep(1/d,d), dep, asy, model = c("log", "alog"), d = 3, plot = FALSE, col = heat.colors(12), blty = 0, grid = if(blty) 150 else 50, lower = 1/3, ord = 1:3, lab = as.character(1:3), lcex = 1)amvevd(x = rep(1/d,d), dep, asy, model = c("log", "alog"), d = 3, plot = FALSE, col = heat.colors(12), blty = 0, grid = if(blty) 150 else 50, lower = 1/3, ord = 1:3, lab = as.character(1:3), lcex = 1)
x |
A vector of length |
dep |
The dependence parameter(s). For the logistic model,
should be a single value. For the asymmetric logistic model,
should be a vector of length |
asy |
The asymmetry parameters for the asymmetric logistic
model. Should be a list with |
model |
The specified model; a character string. Must be
either |
d |
The dimension; an integer greater than or equal to two.
The trivariate case |
plot |
Logical; if |
col |
A list of colours (see |
blty |
The border line type, for the border that surrounds
the triangular image. By default |
grid |
For plotting, the function is evaluated at |
lower |
The minimum value for which colours are plotted. By
defualt |
ord |
A vector of length three, which should be a permutation
of the set |
lab |
A character vector of length three, in which case the
|
lcex |
A numerical value giving the amount by which the
labels should be scaled relative to the default. Ignored
if |
Let and
.
Any multivariate extreme value distribution can be written as
for some function defined on the simplex
,
where
for and
, and where the (generalized extreme value)
marginal parameters are given by
, .
If then is defined by
continuity.
is called (by some authors) the dependence function.
It follows that when is one of the
vertices of , and that is a convex function with
for
all in .
The lower and upper limits of are obtained under complete
dependence and mutual independence respectively.
does not depend on the marginal parameters.
A numeric vector of values. If plotting, the smallest evaluated function value is returned invisibly.
amvnonpar, abvevd,
rmvevd, image
amvevd(dep = 0.5, model = "log") s3pts <- matrix(rexp(30), nrow = 10, ncol = 3) s3pts <- s3pts/rowSums(s3pts) amvevd(s3pts, dep = 0.5, model = "log") ## Not run: amvevd(dep = 0.05, model = "log", plot = TRUE, blty = 1) amvevd(dep = 0.95, model = "log", plot = TRUE, lower = 0.94) asy <- list(.4, .1, .6, c(.3,.2), c(.1,.1), c(.4,.1), c(.2,.3,.2)) amvevd(s3pts, dep = 0.15, asy = asy, model = "alog") amvevd(dep = 0.15, asy = asy, model = "al", plot = TRUE, lower = 0.7)amvevd(dep = 0.5, model = "log") s3pts <- matrix(rexp(30), nrow = 10, ncol = 3) s3pts <- s3pts/rowSums(s3pts) amvevd(s3pts, dep = 0.5, model = "log") ## Not run: amvevd(dep = 0.05, model = "log", plot = TRUE, blty = 1) amvevd(dep = 0.95, model = "log", plot = TRUE, lower = 0.94) asy <- list(.4, .1, .6, c(.3,.2), c(.1,.1), c(.4,.1), c(.2,.3,.2)) amvevd(s3pts, dep = 0.15, asy = asy, model = "alog") amvevd(dep = 0.15, asy = asy, model = "al", plot = TRUE, lower = 0.7)
Calculate non-parametric estimates for the dependence function
of the multivariate extreme value distribution and plot
the estimated function in the trivariate case.
amvnonpar(x = rep(1/d,d), data, d = 3, epmar = FALSE, nsloc = NULL, madj = 0, kmar = NULL, plot = FALSE, col = heat.colors(12), blty = 0, grid = if(blty) 150 else 50, lower = 1/3, ord = 1:3, lab = as.character(1:3), lcex = 1)amvnonpar(x = rep(1/d,d), data, d = 3, epmar = FALSE, nsloc = NULL, madj = 0, kmar = NULL, plot = FALSE, col = heat.colors(12), blty = 0, grid = if(blty) 150 else 50, lower = 1/3, ord = 1:3, lab = as.character(1:3), lcex = 1)
x |
A vector of length |
data |
A matrix or data frame with |
d |
The dimension; an integer greater than or equal to two.
The trivariate case |
epmar |
If |
nsloc |
A data frame with the same number of rows as |
madj |
Performs marginal adjustments. See
|
kmar |
In the rare case that the marginal distributions are known, specifies the GEV parameters to be used instead of maximum likelihood estimates. |
plot |
Logical; if |
col |
A list of colours (see |
blty |
The border line type, for the border that surrounds
the triangular image. By default |
grid |
For plotting, the function is evaluated at |
lower |
The minimum value for which colours are plotted. By
default |
ord |
A vector of length three, which should be a permutation
of the set |
lab |
A character vector of length three, in which case the
|
lcex |
A numerical value giving the amount by which the
labels should be scaled relative to the default. Ignored
if |
A numeric vector of estimates. If plotting, the smallest evaluated estimate is returned invisibly.
The rows of data that contain missing values are not used
in the estimation of the dependence structure, but every non-missing
value is used in estimating the margins.
The dependence function of the multivariate extreme value
distribution is defined in amvevd.
The function amvevd calculates and plots dependence
functions of multivariate logistic and multivariate asymmetric
logistic models.
The estimator plotted or calculated is a multivariate extension of
the bivariate Pickands estimator defined in abvnonpar.
s5pts <- matrix(rexp(50), nrow = 10, ncol = 5) s5pts <- s5pts/rowSums(s5pts) sdat <- rmvevd(100, dep = 0.6, model = "log", d = 5) amvnonpar(s5pts, sdat, d = 5) ## Not run: amvnonpar(data = sdat, plot = TRUE) ## Not run: amvnonpar(data = sdat, plot = TRUE, ord = c(2,3,1), lab = LETTERS[1:3]) ## Not run: amvevd(dep = 0.6, model = "log", plot = TRUE) ## Not run: amvevd(dep = 0.6, model = "log", plot = TRUE, blty = 1)s5pts <- matrix(rexp(50), nrow = 10, ncol = 5) s5pts <- s5pts/rowSums(s5pts) sdat <- rmvevd(100, dep = 0.6, model = "log", d = 5) amvnonpar(s5pts, sdat, d = 5) ## Not run: amvnonpar(data = sdat, plot = TRUE) ## Not run: amvnonpar(data = sdat, plot = TRUE, ord = c(2,3,1), lab = LETTERS[1:3]) ## Not run: amvevd(dep = 0.6, model = "log", plot = TRUE) ## Not run: amvevd(dep = 0.6, model = "log", plot = TRUE, blty = 1)
Compute an analysis of deviance table for two or more nested evd objects.
## S3 method for class 'evd' anova(object, object2, ..., half = FALSE)## S3 method for class 'evd' anova(object, object2, ..., half = FALSE)
object |
An object of class |
object2 |
An object of class |
... |
Further successively nested objects. |
half |
For some non-regular tesing problems the deviance
difference is known to be one half of a chi-squared random
variable. Set |
An object of class c("anova", "data.frame"), with one
row for each model, and the following five columns
M.Df |
The number of parameters. |
Deviance |
The deviance. |
Df |
The number of parameters of the model in the previous row minus the number of parameters. |
Chisq |
The deviance minus the deviance of the model
in the previous row (or twice this if |
Pr(>chisq) |
The p-value calculated by comparing the quantile
|
Circumstances may arise such that the asymptotic distribution of the test statistic is not chi-squared. In particular, this occurs when the smaller model is constrained at the edge of the parameter space. It is up to the user recognize this, and to interpret the output correctly.
In some cases the asymptotic distribution is known to be
one half of a chi-squared; you can set half = TRUE in
these cases.
fbvevd, fextreme,
fgev, forder
uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) trend <- (-49:50)/100 M1 <- fgev(uvdata, nsloc = trend) M2 <- fgev(uvdata) M3 <- fgev(uvdata, shape = 0) anova(M1, M2, M3) bvdata <- rbvevd(100, dep = 0.75, model = "log") M1 <- fbvevd(bvdata, model = "log") M2 <- fbvevd(bvdata, model = "log", dep = 0.75) M3 <- fbvevd(bvdata, model = "log", dep = 1) anova(M1, M2) anova(M1, M3, half = TRUE)uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) trend <- (-49:50)/100 M1 <- fgev(uvdata, nsloc = trend) M2 <- fgev(uvdata) M3 <- fgev(uvdata, shape = 0) anova(M1, M2, M3) bvdata <- rbvevd(100, dep = 0.75, model = "log") M1 <- fbvevd(bvdata, model = "log") M2 <- fbvevd(bvdata, model = "log", dep = 0.75) M3 <- fbvevd(bvdata, model = "log", dep = 1) anova(M1, M2) anova(M1, M3, half = TRUE)
Density function, distribution function and random generation for nine parametric bivariate extreme value models.
dbvevd(x, dep, asy = c(1, 1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), mar1 = c(0, 1, 0), mar2 = mar1, log = FALSE) pbvevd(q, dep, asy = c(1, 1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), mar1 = c(0, 1, 0), mar2 = mar1, lower.tail = TRUE) rbvevd(n, dep, asy = c(1, 1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), mar1 = c(0, 1, 0), mar2 = mar1)dbvevd(x, dep, asy = c(1, 1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), mar1 = c(0, 1, 0), mar2 = mar1, log = FALSE) pbvevd(q, dep, asy = c(1, 1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), mar1 = c(0, 1, 0), mar2 = mar1, lower.tail = TRUE) rbvevd(n, dep, asy = c(1, 1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), mar1 = c(0, 1, 0), mar2 = mar1)
x, q
|
A vector of length two or a matrix with two columns, in which case the density/distribution is evaluated across the rows. |
n |
Number of observations. |
dep |
Dependence parameter for the logistic, asymmetric logistic, Husler-Reiss, negative logistic and asymmetric negative logistic models. |
asy |
A vector of length two, containing the two asymmetry parameters for the asymmetric logistic and asymmetric negative logistic models. |
alpha, beta
|
Alpha and beta parameters for the bilogistic, negative bilogistic, Coles-Tawn and asymmetric mixed models. |
model |
The specified model; a character string. Must be
either |
mar1, mar2
|
Vectors of length three containing marginal parameters, or matrices with three columns where each column represents a vector of values to be passed to the corresponding marginal parameter. |
log |
Logical; if |
lower.tail |
Logical; if |
Define
for and
, where the marginal parameters are given by
,
.
If then is defined by
continuity.
In each of the bivariate distributions functions
given below, the univariate margins
are generalized extreme value, so that
for .
If for some
, the value is either greater than the
upper end point (if ), or less than the lower
end point (if ), of the th univariate
marginal distribution.
model = "log" (Gumbel, 1960)
The bivariate logistic distribution function with
parameter is
where .
This is a special case of the bivariate asymmetric logistic
model.
Complete dependence is obtained in the limit as
approaches zero.
Independence is obtained when .
model = "alog" (Tawn, 1988)
The bivariate asymmetric logistic distribution function with
parameters and
is
where and
.
When the asymmetric logistic
model is equivalent to the logistic model.
Independence is obtained when either ,
or .
Complete dependence is obtained in the limit when
and
approaches zero.
Different limits occur when and
are fixed and approaches zero.
model = "hr" (Husler and Reiss, 1989)
The Husler-Reiss distribution function with parameter
is
where is the standard normal distribution
function and .
Independence is obtained in the limit as approaches zero.
Complete dependence is obtained as tends to infinity.
model = "neglog" (Galambos, 1975)
The bivariate negative logistic distribution function
with parameter is
where .
This is a special case of the bivariate asymmetric negative
logistic model.
Independence is obtained in the limit as approaches zero.
Complete dependence is obtained as tends to infinity.
The earliest reference to this model appears to be
Galambos (1975, Section 4).
model = "aneglog" (Joe, 1990)
The bivariate asymmetric negative logistic distribution function
with parameters parameters and
is
where and .
When the asymmetric negative
logistic model is equivalent to the negative logistic model.
Independence is obtained in the limit as either ,
or approaches zero.
Complete dependence is obtained in the limit when
and
tends to infinity.
Different limits occur when and
are fixed and tends to infinity.
The earliest reference to this model appears to be Joe (1990),
who introduces a multivariate extreme value distribution which
reduces to in the bivariate case.
model = "bilog" (Smith, 1990)
The bilogistic distribution function with
parameters
and is
where
is the root of the equation
.
When the bilogistic model
is equivalent to the logistic model with dependence parameter
.
Complete dependence is obtained in the limit as
approaches zero.
Independence is obtained as
approaches one, and when
one of is fixed and the other
approaches one.
Different limits occur when one of
is fixed and the other
approaches zero.
A bilogistic model is fitted in Smith (1990), where it appears
to have been first introduced.
model = "negbilog" (Coles and Tawn, 1994)
The negative bilogistic distribution function with
parameters
and is
where
is the root of the equation
and .
When the negative bilogistic
model is equivalent to the negative logistic model with dependence
parameter
.
Complete dependence is obtained in the limit as
approaches zero.
Independence is obtained as
tends to infinity, and when
one of is fixed and the other
tends to infinity.
Different limits occur when one of
is fixed and the other
approaches zero.
model = "ct" (Coles and Tawn, 1991)
The Coles-Tawn distribution function with
parameters
and is
where
and
is the beta
distribution function evaluated at with
and
.
Complete dependence is obtained in the limit as
tends to infinity.
Independence is obtained as
approaches zero, and when
one of is fixed and the other
approaches zero.
Different limits occur when one of
is fixed and the other
tends to infinity.
model = "amix" (Tawn, 1988)
The asymmetric mixed distribution function with
parameters
and has
a dependence function with the following cubic polynomial
form.
where and
are non-negative, and where
and are less than or equal
to one.
These constraints imply that beta lies in the interval [-0.5,0.5]
and that alpha lies in the interval [0,1.5], though alpha can
only be greater than one if beta is negative. The strength
of dependence increases for increasing alpha (for fixed beta).
Complete dependence cannot be obtained.
Independence is obtained when both parameters are zero.
For the definition of a dependence function, see
abvevd.
dbvevd gives the density function, pbvevd gives the
distribution function and rbvevd generates random deviates,
for one of nine parametric bivariate extreme value models.
The logistic and asymmetric logistic models respectively are simulated using bivariate versions of Algorithms 1.1 and 1.2 in Stephenson(2003). All other models are simulated using a root finding algorithm to simulate from the conditional distributions.
The simulation of the bilogistic and negative bilogistic models
requires a root finding algorithm to evaluate
within the root finding algorithm used to simulate from the
conditional distributions.
The generation of bilogistic and negative bilogistic random
deviates is therefore relatively slow (about 2.8 seconds per
1000 random vectors on a 450MHz PIII, 512Mb RAM).
The bilogistic and negative bilogistic models can be represented under a single model, using the integral of the maximum of two beta distributions (Joe, 1997).
The Coles-Tawn model is called the Dirichelet model in Coles and Tawn (1991).
Coles, S. G. and Tawn, J. A. (1991) Modelling extreme multivariate events. J. Roy. Statist. Soc., B, 53, 377–392.
Coles, S. G. and Tawn, J. A. (1994) Statistical methods for multivariate extremes: an application to structural design (with discussion). Appl. Statist., 43, 1–48.
Galambos, J. (1975) Order statistics of samples from multivariate distributions. J. Amer. Statist. Assoc., 70, 674–680.
Gumbel, E. J. (1960) Distributions des valeurs extremes en plusieurs dimensions. Publ. Inst. Statist. Univ. Paris, 9, 171–173.
Husler, J. and Reiss, R.-D. (1989) Maxima of normal random vectors: between independence and complete dependence. Statist. Probab. Letters, 7, 283–286.
Joe, H. (1990) Families of min-stable multivariate exponential and multivariate extreme value distributions. Statist. Probab. Letters, 9, 75–81.
Joe, H. (1997) Multivariate Models and Dependence Concepts, London: Chapman & Hall.
Smith, R. L. (1990) Extreme value theory. In Handbook of Applicable Mathematics (ed. W. Ledermann), vol. 7. Chichester: John Wiley, pp. 437–471.
Stephenson, A. G. (2003) Simulating multivariate extreme value distributions of logistic type. Extremes, 6(1), 49–60.
Tawn, J. A. (1988) Bivariate extreme value theory: models and estimation. Biometrika, 75, 397–415.
pbvevd(matrix(rep(0:4,2), ncol=2), dep = 0.7, model = "log") pbvevd(c(2,2), dep = 0.7, asy = c(0.6,0.8), model = "alog") pbvevd(c(1,1), dep = 1.7, model = "hr") margins <- cbind(0, 1, seq(-0.5,0.5,0.1)) rbvevd(11, dep = 1.7, model = "hr", mar1 = margins) rbvevd(10, dep = 1.2, model = "neglog", mar1 = c(10, 1, 1)) rbvevd(10, alpha = 0.7, beta = 0.52, model = "bilog") dbvevd(c(0,0), dep = 1.2, asy = c(0.5,0.9), model = "aneglog") dbvevd(c(0,0), alpha = 0.75, beta = 0.5, model = "ct", log = TRUE) dbvevd(c(0,0), alpha = 0.7, beta = 1.52, model = "negbilog")pbvevd(matrix(rep(0:4,2), ncol=2), dep = 0.7, model = "log") pbvevd(c(2,2), dep = 0.7, asy = c(0.6,0.8), model = "alog") pbvevd(c(1,1), dep = 1.7, model = "hr") margins <- cbind(0, 1, seq(-0.5,0.5,0.1)) rbvevd(11, dep = 1.7, model = "hr", mar1 = margins) rbvevd(10, dep = 1.2, model = "neglog", mar1 = c(10, 1, 1)) rbvevd(10, alpha = 0.7, beta = 0.52, model = "bilog") dbvevd(c(0,0), dep = 1.2, asy = c(0.5,0.9), model = "aneglog") dbvevd(c(0,0), alpha = 0.75, beta = 0.5, model = "ct", log = TRUE) dbvevd(c(0,0), alpha = 0.7, beta = 1.52, model = "negbilog")
Produces a diagnostic plot to assist with threshold choice for bivariate data.
bvtcplot(x, spectral = FALSE, xlab, ylab, ...)bvtcplot(x, spectral = FALSE, xlab, ylab, ...)
x |
A matrix or data frame, ordinarily with two columns, which may contain missing values. |
spectral |
If |
ylab, xlab
|
Graphics parameters. |
... |
Other arguments to be passed to the plotting function. |
If spectral is FALSE (the default), produces a threshold
choice plot as illustrated in Beirlant et al. (2004). With
non-missing bivariate observations, the integers
are plotted against the values
, where
is the th order statistic of the sum of the margins
following empirical transformation to standard Frechet.
A vertical line is drawn at k0, where k0 is the largest
integer for which the y-axis is above the value two. If spectral
is FALSE, the largest k0 data points are used to plot an
estimate of the spectal measure versus .
A list is invisibly returned giving k0 and the values used to
produce the plot.
Beirlant, J., Goegebeur, Y., Segers, J. and Teugels, J. L. (2004) Statistics of Extremes: Theory and Applications., Chichester, England: John Wiley and Sons.
## Not run: bvtcplot(lossalae) ## Not run: bvtcplot(lossalae, spectral = TRUE)## Not run: bvtcplot(lossalae) ## Not run: bvtcplot(lossalae, spectral = TRUE)
Conditional copula functions, conditioning on either margin, for nine parametric bivariate extreme value models.
ccbvevd(x, mar = 2, dep, asy = c(1, 1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), lower.tail = TRUE)ccbvevd(x, mar = 2, dep, asy = c(1, 1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), lower.tail = TRUE)
x |
A matrix or data frame, ordinarily with two columns,
which may contain missing values. A data frame may also
contain a third column of mode |
mar |
One or two; conditions on this margin. |
dep |
Dependence parameter for the logistic, asymmetric logistic, Husler-Reiss, negative logistic and asymmetric negative logistic models. |
asy |
A vector of length two, containing the two asymmetry parameters for the asymmetric logistic and asymmetric negative logistic models. |
alpha, beta
|
Alpha and beta parameters for the bilogistic, negative bilogistic, Coles-Tawn and asymmetric mixed models. |
model |
The specified model; a character string. Must be
either |
lower.tail |
Logical; if |
The function calculates , where is a random
vector with Uniform(0,1) margins and with a dependence structure
given by the specified parametric model. By default, the values
of and are given by the first and second
columns of the argument x. If mar = 1 then this is
reversed.
If x has a third column of mode logical, then
the function returns , according to inference proceedures derived
by Stephenson and Tawn (2004).
See fbvevd. This requires numerical integration,
and hence will be slower.
This function is mainly for internal use. It is used by
plot.bvevd to calculate the conditional P-P
plotting diagnostics.
A numeric vector of probabilities.
Stephenson, A. G. and Tawn, J. A. (2004) Exploiting Occurence Times in Likelihood Inference for Componentwise Maxima. Biometrika 92(1), 213–217.
Plots of estimates of the dependence measures chi and chi-bar for bivariate data.
chiplot(data, nq = 100, qlim = NULL, which = 1:2, conf = 0.95, trunc = TRUE, spcases = FALSE, lty = 1, cilty = 2, col = 1, cicol = 1, xlim = c(0,1), ylim1 = c(-1,1), ylim2 = c(-1,1), main1 = "Chi Plot", main2 = "Chi Bar Plot", xlab = "Quantile", ylab1 = "Chi", ylab2 = "Chi Bar", ask = nb.fig < length(which) && dev.interactive(), ...)chiplot(data, nq = 100, qlim = NULL, which = 1:2, conf = 0.95, trunc = TRUE, spcases = FALSE, lty = 1, cilty = 2, col = 1, cicol = 1, xlim = c(0,1), ylim1 = c(-1,1), ylim2 = c(-1,1), main1 = "Chi Plot", main2 = "Chi Bar Plot", xlab = "Quantile", ylab1 = "Chi", ylab2 = "Chi Bar", ask = nb.fig < length(which) && dev.interactive(), ...)
data |
A matrix or data frame with two columns. Rows (observations) with missing values are stripped from the data before any computations are performed. |
nq |
The number of quantiles at which the measures are evaluated. |
qlim |
The limits of the quantiles at which the measures are evaluated (see Details). |
which |
If only one plot is required, specify |
conf |
The confidence coefficient of the plotted confidence intervals. |
trunc |
Logical; truncate the estimates at their theoretical upper and lower bounds? |
spcases |
If |
lty, cilty
|
Line types for the estimates of the measures and for the confidence intervals respectively. Use zero to supress. |
col, cicol
|
Colour types for the estimates of the measures and for the confidence intervals respectively. |
xlim, xlab
|
Limits and labels for the x-axis; they apply to both plots. |
ylim1 |
Limits for the y-axis of the chi plot. If this
is |
ylim2 |
Limits for the y-axis of the chi-bar plot. |
main1, main2
|
The plot titles for the chi and chi-bar plots respectively. |
ylab1, ylab2
|
The y-axis labels for the chi and chi-bar plots respectively. |
ask |
Logical; if |
... |
Other arguments to be passed to |
These measures are explained in full detail in Coles, Heffernan
and Tawn (1999). A brief treatment is also given in Section
8.4 of Coles(2001).
A short summary is given as follows.
We assume that the data are iid random vectors with common
bivariate distribution function , and we define the random
vector to be distributed according to .
The chi plot is a plot of against empirical estimates of
where and are the marginal distribution
functions, and where is in the interval (0,1).
The quantity is bounded by
where the lower bound is interpreted as -Inf for
and zero for .
These bounds are reflected in the corresponding estimates.
The chi bar plot is a plot of against empirical estimates of
where and are the marginal distribution
functions, and where is in the interval (0,1).
The quantity is bounded by
and these bounds are reflected in the corresponding estimates.
Note that the empirical estimators for and
are undefined near and . By
default the function takes the limits of so that the plots
depicts all values at which the estimators are defined. This can be
overridden by the argument qlim, which must represent a subset
of the default values (and these can be determined using the
component quantile of the invisibly returned list; see
Value).
The confidence intervals within the plot assume that observations are
independent, and that the marginal distributions are estimated exactly.
The intervals are constructed using the delta method; this may
lead to poor interval estimates near and .
The function can be interpreted as a quantile
dependent measure of dependence. In particular, the sign of
determines whether the variables are positively
or negatively associated at quantile level .
By definition, variables are said to be asymptotically independent
when (defined in the limit) is zero.
For independent variables, for all
in (0,1).
For perfectly dependent variables,
for all in (0,1).
For bivariate extreme value distributions,
for all in (0,1), where is the dependence function,
as defined in abvevd. If a bivariate threshold model
is to be fitted (using fbvpot), this plot can therefore
act as a threshold identification plot, since e.g. the use of 95%
marginal quantiles as threshold values implies that
should be approximately constant above .
The function can again be interpreted
as a quantile dependent measure of dependence; it is most useful
within the class of asymptotically independent variables.
For asymptotically dependent variables (i.e. those for which
), we have , where
is again defined in the limit.
For asymptotically independent variables, provides a measure that increases with dependence strength.
For independent variables for
all in (0,1), and hence .
A list with components quantile, chi (if 1 is in
which) and chibar (if 2 is in which)
is invisibly returned.
The components quantile and chi contain those objects
that were passed to the formal arguments x and y of
matplot in order to create the chi plot.
The components quantile and chibar contain those objects
that were passed to the formal arguments x and y of
matplot in order to create the chi-bar plot.
Jan Heffernan and Alec Stephenson
Coles, S. G., Heffernan, J. and Tawn, J. A. (1999) Dependence measures for extreme value analyses. Extremes, 2, 339–365.
Coles, S. G. (2001) An Introduction to Statistical Modelling of Extreme Values, London: Springer–Verlag.
par(mfrow = c(1,2)) smdat1 <- rbvevd(1000, dep = 0.6, model = "log") smdat2 <- rbvevd(1000, dep = 1, model = "log") chiplot(smdat1) chiplot(smdat2)par(mfrow = c(1,2)) smdat1 <- rbvevd(1000, dep = 0.6, model = "log") smdat2 <- rbvevd(1000, dep = 1, model = "log") chiplot(smdat1) chiplot(smdat2)
Identify clusters of exceedences.
clusters(data, u, r = 1, ulow = -Inf, rlow = 1, cmax = FALSE, keep.names = TRUE, plot = FALSE, xdata = seq(along = data), lvals = TRUE, lty = 1, lwd = 1, pch = par("pch"), col = if(n > 250) NULL else "grey", xlab = "Index", ylab = "Data", ...)clusters(data, u, r = 1, ulow = -Inf, rlow = 1, cmax = FALSE, keep.names = TRUE, plot = FALSE, xdata = seq(along = data), lvals = TRUE, lty = 1, lwd = 1, pch = par("pch"), col = if(n > 250) NULL else "grey", xlab = "Index", ylab = "Data", ...)
data |
A numeric vector, which may contain missing values. |
u |
A single value giving the threshold, unless a time varying
threshold is used, in which case |
r |
A postive integer denoting the clustering interval length. By default the interval length is one. |
ulow |
A single value giving the lower threshold, unless a time
varying lower threshold is used, in which case |
rlow |
A postive integer denoting the lower clustering interval length. By default the interval length is one. |
cmax |
Logical; if |
keep.names |
Logical; if |
plot |
Logical; if |
xdata |
A numeric vector with the same length as |
lvals |
Logical; should the values below the threshold and the line depicting the lower threshold be plotted? |
lty, lwd
|
Line type and width for the lines depicting the threshold and the lower threshold. |
pch |
Plotting character. |
col |
Strips of colour |
xlab, ylab
|
Labels for the x and y axis. |
... |
Other graphics parameters. |
The clusters of exceedences are identified as follows.
The first exceedence of the threshold initiates the first cluster.
The first cluster then remains active until either r
consecutive values fall below (or are equal to) the threshold,
or until rlow consecutive values fall below (or are equal
to) the lower threshold.
The next exceedence of the threshold (if it exists) then initiates
the second cluster, and so on.
Missing values are allowed, in which case they are treated as
falling below (or equal to) the threshold, but falling above the
lower threshold.
If cmax is FALSE (the default), a list with one
component for each identified cluster.
If cmax is TRUE, a numeric vector containing the
cluster maxima.
In any case, the returned object has an attribute acs,
giving the average cluster size (where the cluster size is
defined as the number of exceedences within a cluster), which
will be NaN if there are no values above the threshold
(and hence no clusters).
If plot is TRUE, the list of clusters, or vector
of cluster maxima, is returned invisibly.
clusters(portpirie, 4.2, 3) clusters(portpirie, 4.2, 3, cmax = TRUE) clusters(portpirie, 4.2, 3, 3.8, plot = TRUE) clusters(portpirie, 4.2, 3, 3.8, plot = TRUE, lvals = FALSE) tvu <- c(rep(4.2, 20), rep(4.1, 25), rep(4.2, 20)) clusters(portpirie, tvu, 3, plot = TRUE)clusters(portpirie, 4.2, 3) clusters(portpirie, 4.2, 3, cmax = TRUE) clusters(portpirie, 4.2, 3, 3.8, plot = TRUE) clusters(portpirie, 4.2, 3, 3.8, plot = TRUE, lvals = FALSE) tvu <- c(rep(4.2, 20), rep(4.1, 25), rep(4.2, 20)) clusters(portpirie, tvu, 3, plot = TRUE)
Calculate profile and Wald confidence intervals of parameters in fitted models.
## S3 method for class 'evd' confint(object, parm, level = 0.95, ...) ## S3 method for class 'profile.evd' confint(object, parm, level = 0.95, ...)## S3 method for class 'evd' confint(object, parm, level = 0.95, ...) ## S3 method for class 'profile.evd' confint(object, parm, level = 0.95, ...)
object |
Either a fitted model object (of class |
parm |
A character vector of parameters; a confidence interval is calculated for each parameter. If missing, then intervals are returned for all parameters in the fitted model or profile trace. |
level |
A single number giving the confidence level. |
... |
Not used. |
A matrix with two columns giving lower and upper confidence limits.
For profile confidence intervals, this function assumes that the profile trace is unimodal. If the profile trace is not unimodal then the function will give spurious results.
m1 <- fgev(portpirie) confint(m1) ## Not run: pm1 <- profile(m1) ## Not run: plot(pm1) ## Not run: confint(pm1)m1 <- fgev(portpirie) confint(m1) ## Not run: pm1 <- profile(m1) ## Not run: plot(pm1) ## Not run: confint(pm1)
Perform score and likelihood ratio tests of independence for bivariate data, assuming a logistic dependence model as the alternative.
evind.test(x, method = c("ratio", "score"), verbose = FALSE)evind.test(x, method = c("ratio", "score"), verbose = FALSE)
x |
A matrix or data frame, ordinarily with two columns, which may contain missing values. |
method |
The test methodology; either |
verbose |
If |
This simple function fits a stationary bivariate logistic model to the
data and performs a hypothesis test of versus
using the methodology in Tawn (1988). The null
distributions for the printed test statistics are chi-squared on one
df for the likelihood ratio test, and standard normal for the score
test.
An object of class "htest".
Tawn, J. A. (1988) Bivariate extreme value theory: models and estimation. Biometrika, 75, 397–415.
evind.test(sealevel) evind.test(sealevel, method = "score")evind.test(sealevel) evind.test(sealevel, method = "score")
Simulation of first order Markov chains, such that each pair of consecutive values has the dependence structure of one of nine parametric bivariate extreme value distributions.
evmc(n, dep, asy = c(1,1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), margins = c("uniform","rweibull","frechet","gumbel"))evmc(n, dep, asy = c(1,1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), margins = c("uniform","rweibull","frechet","gumbel"))
n |
Number of observations. |
dep |
Dependence parameter for the logistic, asymmetric logistic, Husler-Reiss, negative logistic and asymmetric negative logistic models. |
asy |
A vector of length two, containing the two asymmetry parameters for the asymmetric logistic and asymmetric negative logistic models. |
alpha, beta
|
Alpha and beta parameters for the bilogistic, negative bilogistic, Coles-Tawn and asymmetric mixed models. |
model |
The specified model; a character string. Must be
either |
margins |
The marginal distribution of each value; a
character string. Must be either |
A numeric vector of length n.
evmc(100, alpha = 0.1, beta = 0.1, model = "bilog") evmc(100, dep = 10, model = "hr", margins = "gum")evmc(100, alpha = 0.1, beta = 0.1, model = "bilog") evmc(100, dep = 10, model = "hr", margins = "gum")
Estimates of the extremal index.
exi(data, u, r = 1, ulow = -Inf, rlow = 1)exi(data, u, r = 1, ulow = -Inf, rlow = 1)
data |
A numeric vector, which may contain missing values. |
u |
A single value giving the threshold, unless a time varying
threshold is used, in which case |
r |
Either a postive integer denoting the clustering interval length, or zero, in which case the intervals estimator of Ferro and Segers (2003) is used and following arguments are ignored. By default the interval length is one. |
ulow |
A single value giving the lower threshold, unless a time
varying lower threshold is used, in which case |
rlow |
A postive integer denoting the lower clustering interval length. By default the interval length is one. |
If r is a positive integer the extremal index is estimated
using the inverse of the average cluster size, using the clusters
of exceedences derived from clusters. If r is
zero, an estimate based on inter-exceedance times is used (Ferro
and Segers, 2003).
If there are no exceedances of the threshold, the estimate is
NaN. If there is only one exceedance, the estimate is
one.
A single value estimating the extremal index.
Ferro, C. A. T. and Segers, J. (2003) Inference for clusters of extreme values. JRSS B, 65, 545–556.
exi(portpirie, 4.2, r = 3, ulow = 3.8) tvu <- c(rep(4.2, 20), rep(4.1, 25), rep(4.2, 20)) exi(portpirie, tvu, r = 1) exi(portpirie, tvu, r = 0)exi(portpirie, 4.2, r = 3, ulow = 3.8) tvu <- c(rep(4.2, 20), rep(4.1, 25), rep(4.2, 20)) exi(portpirie, tvu, r = 1) exi(portpirie, tvu, r = 0)
Plots estimates of the extremal index.
exiplot(data, tlim, r = 1, ulow = -Inf, rlow = 1, add = FALSE, nt = 100, lty = 1, xlab = "Threshold", ylab = "Ext. Index", ylim = c(0,1), ...)exiplot(data, tlim, r = 1, ulow = -Inf, rlow = 1, add = FALSE, nt = 100, lty = 1, xlab = "Threshold", ylab = "Ext. Index", ylim = c(0,1), ...)
data |
A numeric vector, which may contain missing values. |
tlim |
A numeric vector of length two, giving the limits for the (time invariant) thresholds at which the estimates are evaluated. |
r, ulow, rlow
|
The estimation method. See |
add |
Add to an existing plot? |
nt |
The number of thresholds at which the estimates are evaluated. |
lty |
Line type. |
xlab, ylab
|
x and y axis labels. |
ylim |
y axis limits. |
... |
Other arguments passed to |
The estimates are calculated using the function exi.
A list with components x and y is invisibly returned.
The first component contains the thresholds, the second contains the
estimates.
sdat <- mar(100, psi = 0.5) tlim <- quantile(sdat, probs = c(0.4,0.9)) exiplot(sdat, tlim) exiplot(sdat, tlim, r = 4, add = TRUE, lty = 2) exiplot(sdat, tlim, r = 0, add = TRUE, lty = 4)sdat <- mar(100, psi = 0.5) tlim <- quantile(sdat, probs = c(0.4,0.9)) exiplot(sdat, tlim) exiplot(sdat, tlim, r = 4, add = TRUE, lty = 2) exiplot(sdat, tlim, r = 0, add = TRUE, lty = 4)
Density function, distribution function, quantile function and random generation for the maximum/minimum of a given number of independent variables from a specified distribution.
dextreme(x, densfun, distnfun, ..., distn, mlen = 1, largest = TRUE, log = FALSE) pextreme(q, distnfun, ..., distn, mlen = 1, largest = TRUE, lower.tail = TRUE) qextreme(p, quantfun, ..., distn, mlen = 1, largest = TRUE, lower.tail = TRUE) rextreme(n, quantfun, ..., distn, mlen = 1, largest = TRUE)dextreme(x, densfun, distnfun, ..., distn, mlen = 1, largest = TRUE, log = FALSE) pextreme(q, distnfun, ..., distn, mlen = 1, largest = TRUE, lower.tail = TRUE) qextreme(p, quantfun, ..., distn, mlen = 1, largest = TRUE, lower.tail = TRUE) rextreme(n, quantfun, ..., distn, mlen = 1, largest = TRUE)
x, q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
Number of observations. |
densfun, distnfun, quantfun
|
Density, distribution and
quantile function of the specified distribution. The density
function must have a |
... |
Parameters of the specified distribution. |
distn |
A character string, optionally given as an
alternative to |
mlen |
The number of independent variables. |
largest |
Logical; if |
log |
Logical; if |
lower.tail |
Logical; if |
dextreme gives the density function, pextreme gives the
distribution function and qextreme gives the quantile function
of the maximum/minimum of mlen independent variables from
a specified distibution. rextreme generates random deviates.
dextreme(2:4, dnorm, pnorm, mean = 0.5, sd = 1.2, mlen = 5) dextreme(2:4, distn = "norm", mean = 0.5, sd = 1.2, mlen = 5) dextreme(2:4, distn = "exp", mlen = 2, largest = FALSE) pextreme(2:4, distn = "exp", rate = 1.2, mlen = 2) qextreme(seq(0.9, 0.6, -0.1), distn = "exp", rate = 1.2, mlen = 2) rextreme(5, qgamma, shape = 1, mlen = 10) p <- (1:9)/10 pexp(qextreme(p, distn = "exp", rate = 1.2, mlen = 1), rate = 1.2) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9dextreme(2:4, dnorm, pnorm, mean = 0.5, sd = 1.2, mlen = 5) dextreme(2:4, distn = "norm", mean = 0.5, sd = 1.2, mlen = 5) dextreme(2:4, distn = "exp", mlen = 2, largest = FALSE) pextreme(2:4, distn = "exp", rate = 1.2, mlen = 2) qextreme(seq(0.9, 0.6, -0.1), distn = "exp", rate = 1.2, mlen = 2) rextreme(5, qgamma, shape = 1, mlen = 10) p <- (1:9)/10 pexp(qextreme(p, distn = "exp", rate = 1.2, mlen = 1), rate = 1.2) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Failure times.
failurefailure
A vector containing 24 observations.
van Montfort, M. A. J. and Otten, A. (1978) On testing a shape parameter in the presence of a scale parameter. Math. Operations Forsch. Statist., Ser. Statistics, 9, 91–104.
Fit models for one of nine parametric bivariate extreme value distributions, including linear modelling of the marginal location parameters, and allowing any of the parameters to be held fixed if desired.
fbvevd(x, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), start, ..., sym = FALSE, nsloc1 = NULL, nsloc2 = NULL, cshape = cscale, cscale = cloc, cloc = FALSE, std.err = TRUE, corr = FALSE, method = "BFGS", warn.inf = TRUE)fbvevd(x, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), start, ..., sym = FALSE, nsloc1 = NULL, nsloc2 = NULL, cshape = cscale, cscale = cloc, cloc = FALSE, std.err = TRUE, corr = FALSE, method = "BFGS", warn.inf = TRUE)
x |
A matrix or data frame, ordinarily with two columns,
which may contain missing values. A data frame may also
contain a third column of mode |
model |
The specified model; a character string. Must be
either |
start |
A named list giving the initial values for the
parameters over which the likelihood is to be maximized.
If |
... |
Additional parameters, either for the bivariate extreme
value model or for the optimization function |
sym |
Logical; if |
nsloc1, nsloc2
|
A data frame with the same number of rows as
|
cshape |
Logical; if |
cscale |
Logical; if |
cloc |
Logical; if |
std.err |
Logical; if |
corr |
Logical; if |
method |
The optimization method (see |
warn.inf |
Logical; if |
The dependence parameter names are one or more of dep,
asy1, asy2, alpha and beta, depending on
the model selected (see rbvevd). The marginal parameter
names are loc1, scale1 and shape1 for the first
margin, and loc2, scale2 and shape2 for the
second margin.
If nsloc1 is not NULL, so that a linear model is
implemented for the first marginal location parameter, the parameter
names for the first margin are loc1, loc1x1,
..., loc1xn, scale and shape, where
x1, ..., xn are the column names of nsloc1,
so that loc1 is the intercept of the linear model, and
loc1x1, ..., loc1xn are the
ncol(nsloc1) coefficients.
When nsloc2 is not NULL, the parameter names for the
second margin are constructed similarly.
It is recommended that the covariates within the linear models for
the location parameters are (at least approximately) centered and
scaled (i.e. that the columns of nsloc1 and nsloc2
are centered and scaled), particularly if automatic starting values
are used, since the starting values for the associated parameters are
then zero. If cloc is TRUE, both nsloc1 and
nsloc2 must be identical, since a common linear model is
then implemented on both margins.
If cshape is true, the models are constrained so that
shape2 = shape1. The parameter shape2 is then
taken to be specified, so that e.g. the common shape
parameter can only be fixed at zero using shape1 = 0,
since using shape2 = 0 gives an error. Similar
comments apply for cscale and cloc.
If sym is TRUE, the asymmetric logistic and
asymmetric negative logistic models are constrained so that
asy2 = asy1, and the Coles-Tawn model is constrained
so that beta = alpha. The parameter asy2 or
beta is then taken to be specified, so that e.g.
the parameters asy1 and asy2 can only
be fixed at 0.8 using asy1 = 0.8, since
using asy2 = 0.8 gives an error.
Bilogistic and negative bilogistic models constrained to symmetry are logistic and negative logistic models respectively. The (symmetric) mixed model (e.g. Tawn, 1998) can be obtained as a special case of the asymmetric logistic or asymmetric mixed models (see Examples).
The value Dependence given in the printed output
is , where is the estimated dependence
function (see abvevd). It measures the strength of
dependence, and lies in the interval [0,1]; at independence and
complete dependence it is zero and one respectively (Coles,
Heffernan and Tawn, 1999). See chiplot for
further information.
Returns an object of class c("bvevd","evd").
The generic accessor functions fitted (or
fitted.values), std.errors,
deviance, logLik and
AIC extract various features of the
returned object.
The functions profile and profile2d can be
used to obtain deviance profiles.
The function anova compares nested models, and the
function AIC compares non-nested models.
The function plot produces diagnostic plots.
An object of class c("bvevd","evd") is a list containing
the following components
estimate |
A vector containing the maximum likelihood estimates. |
std.err |
A vector containing the standard errors. |
fixed |
A vector containing the parameters that have been fixed at specific values within the optimization. |
fixed2 |
A vector containing the parameters that have been set to be equal to other model parameters. |
param |
A vector containing all parameters (those optimized, those fixed to specific values, and those set to be equal to other model parameters). |
deviance |
The deviance at the maximum likelihood estimates. |
dep.summary |
The estimate of |
corr |
The correlation matrix. |
var.cov |
The variance covariance matrix. |
convergence, counts, message
|
Components taken from the
list returned by |
data |
The data passed to the argument |
tdata |
The data, transformed to stationarity (for non-stationary models). |
nsloc1, nsloc2
|
The arguments |
n |
The number of rows in |
sym |
The argument |
cmar |
The vector |
model |
The argument |
call |
The call of the current function. |
If x is a data frame with a third column of mode
logical, then the model is fitted using the likelihood
derived by Stephenson and Tawn (2004). This is appropriate
when each bivariate data point comprises componentwise maxima
from some underlying bivariate process, and where the
corresponding logical value denotes whether or not the maxima
were caused by the same event within that process.
Under this scheme the diagnostic plots that are produced
using plot are somewhat different to those described
in plot.bvevd: the density, dependence function
and quantile curves plots contain fitted functions for
observations where the logical case is unknown, and the
conditional P-P plots condition on both the logical case and
the given margin (which requires numerical integration at each
data point).
For numerical reasons parameters are subject to artificial
constraints. Specifically, these constraints are: marginal
scale parameters not less than 0.01; dep not less
than [0.1] [0.2] [0.05] in [logistic] [Husler-Reiss]
[negative logistic] models; dep not greater
than [10] [5] in [Husler-Reiss] [negative logistic] models;
asy1 and asy2 not less than 0.001;
alpha and beta not less than [0.1] [0.1]
[0.001] in [bilogistic] [negative bilogistic] [Coles-Tawn]
models; alpha and beta not greater than [0.999]
[20] [30] in [bilogistic] [negative bilogistic] [Coles-Tawn]
models.
The standard errors and the correlation matrix in the returned
object are taken from the observed information, calculated by a
numerical approximation.
They must be interpreted with caution when either of the
marginal shape parameters are less than , because
the usual asymptotic properties of maximum likelihood estimators
do not then hold (Smith, 1985).
Coles, S. G., Heffernan, J. and Tawn, J. A. (1999) Dependence measures for extreme value analyses. Extremes, 2, 339–365.
Smith, R. L. (1985) Maximum likelihood estimation in a class of non-regular cases. Biometrika, 72, 67–90.
Stephenson, A. G. and Tawn, J. A. (2004) Exploiting Occurence Times in Likelihood Inference for Componentwise Maxima. Biometrika 92(1), 213–217.
Tawn, J. A. (1988) Bivariate extreme value theory: models and estimation. Biometrika, 75, 397–415.
anova.evd, optim,
plot.bvevd, profile.evd,
profile2d.evd, rbvevd
bvdata <- rbvevd(100, dep = 0.6, model = "log", mar1 = c(1.2,1.4,0.4)) M1 <- fbvevd(bvdata, model = "log") M2 <- fbvevd(bvdata, model = "log", dep = 0.75) anova(M1, M2) par(mfrow = c(2,2)) plot(M1) plot(M1, mar = 1) plot(M1, mar = 2) ## Not run: par(mfrow = c(1,1)) ## Not run: M1P <- profile(M1, which = "dep") ## Not run: plot(M1P) trend <- (-49:50)/100 rnd <- runif(100, min = -.5, max = .5) fbvevd(bvdata, model = "log", nsloc1 = trend) fbvevd(bvdata, model = "log", nsloc1 = trend, nsloc2 = data.frame(trend = trend, random = rnd)) fbvevd(bvdata, model = "log", nsloc1 = trend, nsloc2 = data.frame(trend = trend, random = rnd), loc2random = 0) bvdata <- rbvevd(100, dep = 1, asy = c(0.5,0.5), model = "anegl") anlog <- fbvevd(bvdata, model = "anegl") mixed <- fbvevd(bvdata, model = "anegl", dep = 1, sym = TRUE) anova(anlog, mixed) amixed <- fbvevd(bvdata, model = "amix") mixed <- fbvevd(bvdata, model = "amix", beta = 0) anova(amixed, mixed)bvdata <- rbvevd(100, dep = 0.6, model = "log", mar1 = c(1.2,1.4,0.4)) M1 <- fbvevd(bvdata, model = "log") M2 <- fbvevd(bvdata, model = "log", dep = 0.75) anova(M1, M2) par(mfrow = c(2,2)) plot(M1) plot(M1, mar = 1) plot(M1, mar = 2) ## Not run: par(mfrow = c(1,1)) ## Not run: M1P <- profile(M1, which = "dep") ## Not run: plot(M1P) trend <- (-49:50)/100 rnd <- runif(100, min = -.5, max = .5) fbvevd(bvdata, model = "log", nsloc1 = trend) fbvevd(bvdata, model = "log", nsloc1 = trend, nsloc2 = data.frame(trend = trend, random = rnd)) fbvevd(bvdata, model = "log", nsloc1 = trend, nsloc2 = data.frame(trend = trend, random = rnd), loc2random = 0) bvdata <- rbvevd(100, dep = 1, asy = c(0.5,0.5), model = "anegl") anlog <- fbvevd(bvdata, model = "anegl") mixed <- fbvevd(bvdata, model = "anegl", dep = 1, sym = TRUE) anova(anlog, mixed) amixed <- fbvevd(bvdata, model = "amix") mixed <- fbvevd(bvdata, model = "amix", beta = 0) anova(amixed, mixed)
Fit models for one of nine parametric bivariate extreme-value distributions using threshold exceedances, allowing any of the parameters to be held fixed if desired.
fbvpot(x, threshold, model = c("log", "bilog", "alog", "neglog", "negbilog", "aneglog", "ct", "hr", "amix"), likelihood = c("censored", "poisson"), start, ..., sym = FALSE, cshape = cscale, cscale = FALSE, std.err = TRUE, corr = FALSE, method = "BFGS", warn.inf = TRUE)fbvpot(x, threshold, model = c("log", "bilog", "alog", "neglog", "negbilog", "aneglog", "ct", "hr", "amix"), likelihood = c("censored", "poisson"), start, ..., sym = FALSE, cshape = cscale, cscale = FALSE, std.err = TRUE, corr = FALSE, method = "BFGS", warn.inf = TRUE)
x |
A matrix or data frame with two columns. If this contains missing values, those values are treated as if they fell below the corresponding marginal threshold. |
threshold |
A vector of two thresholds. |
model |
The specified model; a character string. Must be
either |
likelihood |
The likelihood model; either |
start |
A named list giving the initial values for all of the
parameters in the model. If |
... |
Additional parameters, either for the bivariate extreme
value model or for the optimization function |
sym |
Logical; if |
cshape |
Logical; if |
cscale |
Logical; if |
std.err |
Logical; if |
corr |
Logical; if |
method |
The optimization method (see |
warn.inf |
Logical; if |
For the "censored" method bivariate peaks over threshold models
are fitted by maximizing the censored likelihood as given in e.g. Section
8.3.1 of Coles(2001). For the "poisson" method models are fitted
using Equation 5.4 of Coles and Tawn (1991), see also Joe, Smith and
Weissman (1992). This method is only available for models whose spectral
measure does not contain point masses (see hbvevd). It is not
recommended as in practice it can produce poor estimates.
For either likelihood the margins are modelled using a generalized Pareto
distribution for points above the threshold and an empirical model for
those below. For the "poisson" method data lying below both thresholds
is not used. For the "censored" method the number of points lying
below both thresholds is used, but the locations of the those points are
not.
The dependence parameter names are one or more of dep,
asy1, asy2, alpha and beta, depending on
the model selected (see rbvevd).
The marginal parameter names are scale1 and shape1
for the first margin, and scale2 and shape2 for the
second margin.
If cshape is true, the models are constrained so that
shape2 = shape1. The parameter shape2 is then
taken to be specified, so that e.g. the common shape
parameter can only be fixed at zero using shape1 = 0,
since using shape2 = 0 gives an error. Similar
comments apply for cscale.
If sym is TRUE, the asymmetric logistic and
asymmetric negative logistic models are constrained so that
asy2 = asy1, and the Coles-Tawn model is constrained
so that beta = alpha. The parameter asy2 or
beta is then taken to be specified, so that e.g.
the parameters asy1 and asy2 can only
be fixed at 0.8 using asy1 = 0.8, since
using asy2 = 0.8 gives an error.
Bilogistic and negative bilogistic models constrained to symmetry are logistic and negative logistic models respectively. The (symmetric) mixed model (e.g. Tawn, 1998) can be obtained as a special case of the asymmetric logistic or asymmetric mixed models (see fbvevd).
For numerical reasons the parameters of each model are subject the
artificial constraints given in fbvevd.
Returns an object of class c("bvpot","evd").
The generic accessor functions fitted (or
fitted.values), std.errors,
deviance, logLik and
AIC extract various features of the
returned object.
The functions profile and profile2d can be
used to obtain deviance profiles.
The function anova compares nested models, and the
function AIC compares non-nested models.
There is currently no plot method available.
An object of class c("bvpot","evd") is a list containing
the following components
estimate |
A vector containing the maximum likelihood estimates. |
std.err |
A vector containing the standard errors. |
fixed |
A vector containing the parameters that have been fixed at specific values within the optimization. |
fixed2 |
A vector containing the parameters that have been set to be equal to other model parameters. |
param |
A vector containing all parameters (those optimized, those fixed to specific values, and those set to be equal to other model parameters). |
deviance |
The deviance at the maximum likelihood estimates. |
dep.summary |
A value summarizing the strength of dependence in the fitted model (see fbvevd). |
corr |
The correlation matrix. |
var.cov |
The variance covariance matrix. |
convergence, counts, message
|
Components taken from the
list returned by |
data |
The data passed to the argument |
threshold |
The argument |
n |
The number of rows in |
nat |
The vector of length three containing the number of exceedances on the first, second and both margins respectively. |
likelihood |
The argument |
sym |
The argument |
cmar |
The vector |
model |
The argument |
call |
The call of the current function. |
The standard errors and the correlation matrix in the returned
object are taken from the observed information, calculated by a
numerical approximation.
They must be interpreted with caution when either of the
marginal shape parameters are less than , because
the usual asymptotic properties of maximum likelihood estimators
do not then hold (Smith, 1985).
Chris Ferro and Alec Stephenson
Coles, S. G. (2001) An Introduction to Statistical Modelling of Extreme Values, London: Springer–Verlag.
Coles, S. G. and Tawn, J. A. (1991) Modelling multivariate extreme events. J. R. Statist. Soc. B, 53, 377–392.
Joe, H., Smith, R. L. and Weissman, I. (1992) Bivariate threshold methods for extremes. J. R. Statist. Soc. B, 54, 171–183.
Smith, R. L. (1985) Maximum likelihood estimation in a class of non-regular cases. Biometrika, 72, 67–90.
abvevd, anova.evd,
fbvevd, optim, rbvevd
bvdata <- rbvevd(1000, dep = 0.5, model = "log") u <- apply(bvdata, 2, quantile, probs = 0.9) M1 <- fbvpot(bvdata, u, model = "log") M2 <- fbvpot(bvdata, u, "log", dep = 0.5) anova(M1, M2)bvdata <- rbvevd(1000, dep = 0.5, model = "log") u <- apply(bvdata, 2, quantile, probs = 0.9) M1 <- fbvpot(bvdata, u, model = "log") M2 <- fbvpot(bvdata, u, "log", dep = 0.5) anova(M1, M2)
Maximum-likelihood fitting for the distribution of the maximum/minimum of a given number of independent variables from a specified distribution.
fextreme(x, start, densfun, distnfun, ..., distn, mlen = 1, largest = TRUE, std.err = TRUE, corr = FALSE, method = "Nelder-Mead")fextreme(x, start, densfun, distnfun, ..., distn, mlen = 1, largest = TRUE, std.err = TRUE, corr = FALSE, method = "Nelder-Mead")
x |
A numeric vector. |
start |
A named list giving the initial values for the parameters over which the likelihood is to be maximized. |
densfun, distnfun
|
Density and distribution function of the specified distribution. |
... |
Additional parameters, either for the specified
distribution or for the optimization function |
distn |
A character string, optionally specified as an alternative
to |
mlen |
The number of independent variables. |
largest |
Logical; if |
std.err |
Logical; if |
corr |
Logical; if |
method |
The optimization method (see |
Maximization of the log-likelihood is performed. The estimated standard errors are taken from the observed information, calculated by a numerical approximation.
If the density and distribution functions are user defined, the order
of the arguments must mimic those in R base (i.e. data first,
parameters second).
Density functions must have log arguments.
Returns an object of class c("extreme","evd").
The generic accessor functions fitted (or
fitted.values), std.errors,
deviance, logLik and
AIC extract various features of the
returned object.
The function anova compares nested models.
An object of class c("extreme","evd") is a list containing
at most the following components
estimate |
A vector containing the maximum likelihood estimates. |
std.err |
A vector containing the standard errors. |
deviance |
The deviance at the maximum likelihood estimates. |
corr |
The correlation matrix. |
var.cov |
The variance covariance matrix. |
convergence, counts, message
|
Components taken from the
list returned by |
call |
The call of the current function. |
data |
The data passed to the argument |
n |
The length of |
uvdata <- rextreme(100, qnorm, mean = 0.56, mlen = 365) fextreme(uvdata, list(mean = 0, sd = 1), distn = "norm", mlen = 365) fextreme(uvdata, list(rate = 1), distn = "exp", mlen = 365, method = "Brent", lower=0.01, upper=10) fextreme(uvdata, list(scale = 1), shape = 1, distn = "gamma", mlen = 365, method = "Brent", lower=0.01, upper=10) fextreme(uvdata, list(shape = 1, scale = 1), distn = "gamma", mlen = 365)uvdata <- rextreme(100, qnorm, mean = 0.56, mlen = 365) fextreme(uvdata, list(mean = 0, sd = 1), distn = "norm", mlen = 365) fextreme(uvdata, list(rate = 1), distn = "exp", mlen = 365, method = "Brent", lower=0.01, upper=10) fextreme(uvdata, list(scale = 1), shape = 1, distn = "gamma", mlen = 365, method = "Brent", lower=0.01, upper=10) fextreme(uvdata, list(shape = 1, scale = 1), distn = "gamma", mlen = 365)
Maximum-likelihood fitting for the generalized extreme value distribution, including linear modelling of the location parameter, and allowing any of the parameters to be held fixed if desired.
fgev(x, start, ..., nsloc = NULL, prob = NULL, std.err = TRUE, corr = FALSE, method = "BFGS", warn.inf = TRUE)fgev(x, start, ..., nsloc = NULL, prob = NULL, std.err = TRUE, corr = FALSE, method = "BFGS", warn.inf = TRUE)
x |
A numeric vector, which may contain missing values. |
start |
A named list giving the initial values for the
parameters over which the likelihood is to be maximized.
If |
... |
Additional parameters, either for the GEV model
or for the optimization function |
nsloc |
A data frame with the same number of rows as the
length of |
prob |
Controls the parameterization of the model (see
Details). Should be either |
std.err |
Logical; if |
corr |
Logical; if |
method |
The optimization method (see |
warn.inf |
Logical; if |
If prob is NULL (the default):
For stationary models the parameter names are loc, scale
and shape, for the location, scale and shape parameters
respectively.
For non-stationary models, the parameter names are loc,
locx1, ..., locxn, scale and
shape, where x1, ..., xn are the column names
of nsloc, so that loc is the intercept of the
linear model, and locx1, ..., locxn
are the ncol(nsloc) coefficients.
If nsloc is a vector it is converted into a single column
data frame with column name trend, and hence the associated
trend parameter is named loctrend.
If is a probability:
The fit is performed using a different parameterization.
Let , and denote the location, scale
and shape parameters of the GEV distribution.
For stationary models, the distribution is parameterized
using , where
is such that , where is the
GEV distribution function.
is therefore the probability in the upper
tail corresponding to the quantile .
If prob is zero, then is the upper end point
, and is restricted to the negative
(Weibull) axis.
If prob is one, then is the lower end point
, and is restricted to the positive
(Frechet) axis.
The parameter names are quantile, scale
and shape, for , and
respectively.
For non-stationary models the parameter is again given by
the equation above, but becomes the intercept of the linear
model for the location parameter, so that quantile replaces
(the intercept) loc, and hence the parameter names are
quantile, locx1, ..., locxn,
scale and shape, where x1, ..., xn are
the column names of nsloc.
In either case:
For non-stationary fitting it is recommended that the covariates
within the linear model for the location parameter are (at least
approximately) centered and scaled (i.e.\ that the columns of
nsloc are centered and scaled), particularly if automatic
starting values are used, since the starting values for the
associated parameters are then zero.
Returns an object of class c("gev","uvevd","evd").
The generic accessor functions fitted (or
fitted.values), std.errors,
deviance, logLik and
AIC extract various features of the
returned object.
The functions profile and profile2d are
used to obtain deviance profiles for the model parameters.
In particular, profiles of the quantile can be
calculated and plotted when .
The function anova compares nested models.
The function plot produces diagnostic plots.
An object of class c("gev","uvevd","evd") is a list
containing at most the following components
estimate |
A vector containing the maximum likelihood estimates. |
std.err |
A vector containing the standard errors. |
fixed |
A vector containing the parameters of the model that have been held fixed. |
param |
A vector containing all parameters (optimized and fixed). |
deviance |
The deviance at the maximum likelihood estimates. |
corr |
The correlation matrix. |
var.cov |
The variance covariance matrix. |
convergence, counts, message
|
Components taken from the
list returned by |
data |
The data passed to the argument |
tdata |
The data, transformed to stationarity (for non-stationary models). |
nsloc |
The argument |
n |
The length of |
prob |
The argument |
loc |
The location parameter. If |
call |
The call of the current function. |
The standard errors and the correlation matrix in the returned
object are taken from the observed information, calculated by a
numerical approximation.
They must be interpreted with caution when the shape parameter
is less than , because the usual asymptotic
properties of maximum likelihood estimators do not then
hold (Smith, 1985).
Smith, R. L. (1985) Maximum likelihood estimation in a class of non-regular cases. Biometrika, 72, 67–90.
anova.evd, optim,
plot.uvevd, profile.evd,
profile2d.evd
uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) trend <- (-49:50)/100 M1 <- fgev(uvdata, nsloc = trend, control = list(trace = 1)) M2 <- fgev(uvdata) M3 <- fgev(uvdata, shape = 0) M4 <- fgev(uvdata, scale = 1, shape = 0) anova(M1, M2, M3, M4) par(mfrow = c(2,2)) plot(M2) ## Not run: M2P <- profile(M2) ## Not run: plot(M2P) rnd <- runif(100, min = -.5, max = .5) fgev(uvdata, nsloc = data.frame(trend = trend, random = rnd)) fgev(uvdata, nsloc = data.frame(trend = trend, random = rnd), locrandom = 0) uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata, prob = 0.1) M2 <- fgev(uvdata, prob = 0.01) ## Not run: M1P <- profile(M1, which = "quantile") ## Not run: M2P <- profile(M2, which = "quantile") ## Not run: plot(M1P) ## Not run: plot(M2P)uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) trend <- (-49:50)/100 M1 <- fgev(uvdata, nsloc = trend, control = list(trace = 1)) M2 <- fgev(uvdata) M3 <- fgev(uvdata, shape = 0) M4 <- fgev(uvdata, scale = 1, shape = 0) anova(M1, M2, M3, M4) par(mfrow = c(2,2)) plot(M2) ## Not run: M2P <- profile(M2) ## Not run: plot(M2P) rnd <- runif(100, min = -.5, max = .5) fgev(uvdata, nsloc = data.frame(trend = trend, random = rnd)) fgev(uvdata, nsloc = data.frame(trend = trend, random = rnd), locrandom = 0) uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata, prob = 0.1) M2 <- fgev(uvdata, prob = 0.01) ## Not run: M1P <- profile(M1, which = "quantile") ## Not run: M2P <- profile(M2, which = "quantile") ## Not run: plot(M1P) ## Not run: plot(M2P)
Maximum-likelihood fitting for the maximum of two gumbel distributions, allowing any of the parameters to be held fixed if desired.
fgumbelx(x, start, ..., nsloc1 = NULL, nsloc2 = NULL, std.err = TRUE, corr = FALSE, method = "BFGS", warn.inf = TRUE)fgumbelx(x, start, ..., nsloc1 = NULL, nsloc2 = NULL, std.err = TRUE, corr = FALSE, method = "BFGS", warn.inf = TRUE)
x |
A numeric vector, which may contain missing values. |
start |
A named list giving the initial values for the
parameters over which the likelihood is to be maximized.
If |
... |
Additional parameters, either for the fitted model
or for the optimization function |
nsloc1 |
A data frame with the same number of rows as the
length of |
nsloc2 |
A data frame with the same number of rows as the
length of |
std.err |
Logical; if |
corr |
Logical; if |
method |
The optimization method (see |
warn.inf |
Logical; if |
For stationary models the parameter names are loc1, scale1,
loc2 and scale2 for the location and scale parameters of
two Gumbel distributions, where loc2 must be greater or equal to
loc1.
The likelihood may have multiple local optima and therefore may be difficult to fit properly; the default starting values use a moment based approach, however it is recommended that the user specify multiple different starting values and experiment with different optimization methods.
Using non-stationary models with nsloc1 and nsloc2 is not recommended due to the model complexity; the data also cannot be transformed back to stationarity so diagnostic plots will be misleading in this case.
Returns an object of class c("gumbelx","evd").
The generic accessor functions fitted (or
fitted.values), std.errors,
deviance, logLik and
AIC extract various features of the
returned object.
The functions profile and profile2d are
used to obtain deviance profiles for the model parameters.
The function anova compares nested models.
The function plot produces diagnostic plots.
An object of class c("gumbelx","evd") is a list
containing at most the following components
estimate |
A vector containing the maximum likelihood estimates. |
std.err |
A vector containing the standard errors. |
fixed |
A vector containing the parameters of the model that have been held fixed. |
param |
A vector containing all parameters (optimized and fixed). |
deviance |
The deviance at the maximum likelihood estimates. |
corr |
The correlation matrix. |
var.cov |
The variance covariance matrix. |
convergence, counts, message
|
Components taken from the
list returned by |
data |
The data passed to the argument |
nsloc1 |
The argument |
nsloc2 |
The argument |
n |
The length of |
call |
The call of the current function. |
This function is experimental and involves optimizing over a potentially complex surface.
uvdata <- rgumbelx(100, loc1 = 0, scale1 = 1, loc2 = 1, scale2 = 1) fgumbelx(uvdata, loc1 = 0, scale1 = 1)uvdata <- rgumbelx(100, loc1 = 0, scale1 = 1, loc2 = 1, scale2 = 1) fgumbelx(uvdata, loc1 = 0, scale1 = 1)
Maximum-likelihood fitting for the distribution of a selected order statistic of a given number of independent variables from a specified distribution.
forder(x, start, densfun, distnfun, ..., distn, mlen = 1, j = 1, largest = TRUE, std.err = TRUE, corr = FALSE, method = "Nelder-Mead")forder(x, start, densfun, distnfun, ..., distn, mlen = 1, j = 1, largest = TRUE, std.err = TRUE, corr = FALSE, method = "Nelder-Mead")
x |
A numeric vector. |
start |
A named list giving the initial values for the parameters over which the likelihood is to be maximized. |
densfun, distnfun
|
Density and distribution function of the specified distribution. |
... |
Additional parameters, either for the specified
distribution or for the optimization function |
distn |
A character string, optionally specified as an alternative
to |
mlen |
The number of independent variables. |
j |
The order statistic, taken as the |
largest |
Logical; if |
std.err |
Logical; if |
corr |
Logical; if |
method |
The optimization method (see |
Maximization of the log-likelihood is performed. The estimated standard errors are taken from the observed information, calculated by a numerical approximation.
If the density and distribution functions are user defined, the order
of the arguments must mimic those in R base (i.e. data first,
parameters second).
Density functions must have log arguments.
Returns an object of class c("extreme","evd").
This class is defined in fextreme.
The generic accessor functions fitted (or
fitted.values), std.errors,
deviance, logLik and
AIC extract various features of the
returned object.
The function anova compares nested models.
uvd <- rorder(100, qnorm, mean = 0.56, mlen = 365, j = 2) forder(uvd, list(mean = 0, sd = 1), distn = "norm", mlen = 365, j = 2) forder(uvd, list(rate = 1), distn = "exp", mlen = 365, j = 2, method = "Brent", lower=0.01, upper=10) forder(uvd, list(scale = 1), shape = 1, distn = "gamma", mlen = 365, j = 2, method = "Brent", lower=0.01, upper=10) forder(uvd, list(shape = 1, scale = 1), distn = "gamma", mlen = 365, j = 2)uvd <- rorder(100, qnorm, mean = 0.56, mlen = 365, j = 2) forder(uvd, list(mean = 0, sd = 1), distn = "norm", mlen = 365, j = 2) forder(uvd, list(rate = 1), distn = "exp", mlen = 365, j = 2, method = "Brent", lower=0.01, upper=10) forder(uvd, list(scale = 1), shape = 1, distn = "gamma", mlen = 365, j = 2, method = "Brent", lower=0.01, upper=10) forder(uvd, list(shape = 1, scale = 1), distn = "gamma", mlen = 365, j = 2)
The fox data frame has 33 rows and 2 columns.
The columns contain maximum annual flood discharges, in units
of 1000 cubed feet per second, from the Fox River in Wisconsin,
USA at Berlin (upstream) and Wrightstown (downstream), for the
years 1918 to 1950.
The row names give the years of observation.
foxfox
This data frame contains the following columns:
A numeric vector containing maximum annual flood discharges at Berlin (upstream).
A numeric vector containing maximum annual flood discharges at Wrightstown (downstream).
Gumbel, E. J. and Mustafi, C. K. (1967) Some analytical properties of bivariate extremal distributions. J. Amer. Statist. Assoc., 62, 569–588.
Maximum-likelihood fitting for peaks over threshold modelling, using the Generalized Pareto or Point Process representation, allowing any of the parameters to be held fixed if desired.
fpot(x, threshold, model = c("gpd", "pp"), start, npp = length(x), cmax = FALSE, r = 1, ulow = -Inf, rlow = 1, mper = NULL, ..., std.err = TRUE, corr = FALSE, method = "BFGS", warn.inf = TRUE)fpot(x, threshold, model = c("gpd", "pp"), start, npp = length(x), cmax = FALSE, r = 1, ulow = -Inf, rlow = 1, mper = NULL, ..., std.err = TRUE, corr = FALSE, method = "BFGS", warn.inf = TRUE)
x |
A numeric vector. If this contains missing values, those values are treated as if they fell below the threshold. |
threshold |
The threshold. |
model |
The model; either |
start |
A named list giving the initial values for the
parameters over which the likelihood is to be maximized.
If |
npp |
The data should contain |
cmax |
Logical; if |
r, ulow, rlow
|
Arguments used for the identification of
clusters of exceedences (see |
mper |
Controls the parameterization of the generalized
Pareto model. Should be either |
... |
Additional parameters, either for the model
or for the optimization function |
std.err |
Logical; if |
corr |
Logical; if |
method |
The optimization method (see |
warn.inf |
Logical; if |
The exeedances over the threshold threshold (if cmax is
FALSE) or the maxima of the clusters of exeedances (if
cmax is TRUE) are (if model = "gpd") fitted to a
generalized Pareto distribution (GPD) with location threshold.
If model = "pp" the exceedances are fitted to a
non-homogeneous Poisson process (Coles, 2001).
If mper is NULL (the default), the parameters of
the model (if model = "gpd") are scale and
shape, for the scale and shape parameters of the GPD.
If model = "pp" the parameters are loc, scale
and shape. Under model = "pp" the parameters can be
interpreted as parameters of the Generalized Extreme Value
distribution, fitted to the maxima of npp random variables.
In this case, the value of npp should be reasonably large.
For both characterizations, the shape parameters are
equivalent. The scale parameter under the generalized Pareto
characterization is equal to , where ,
and are the location, scale and shape parameters
under the Point Process characterization, and where is
the threshold.
If is a positive value, then
the generalized Pareto model is reparameterized so that the
parameters are rlevel and shape, where
rlevel is the “period” return level, where
“period” is defined via the argument npp.
The “period” return level is defined as follows.
Let be the fitted generalized Pareto distribution
function, with location , so that
is the fitted probability of an exceedance
over given an exceedance over .
The fitted probability of an exceedance over is
therefore , where is the estimated
probabilty of exceeding , which is given by the empirical
proportion of exceedances.
The “period” return level satisfies
, where is the number
of points per period (multiplied by the estimate of the
extremal index, if cluster maxima are fitted).
In other words, is the quantile of the fitted model
that corresponds to the upper tail probability .
If mper is infinite, then is the upper end point,
given by threshold minus ,
and the shape parameter is then restricted to be negative.
Returns an object of class c("pot","uvevd","pot").
The generic accessor functions fitted (or
fitted.values), std.errors,
deviance, logLik and
AIC extract various features of the
returned object.
The function profile can be
used to obtain deviance profiles for the model parameters.
In particular, profiles of the period
return level can be calculated and plotted when
.
The function anova compares nested models.
The function plot produces diagnostic plots.
An object of class c("pot","uvevd","evd") is a list containing
the following components
estimate |
A vector containing the maximum likelihood estimates. |
std.err |
A vector containing the standard errors. |
fixed |
A vector containing the parameters of the model that have been held fixed. |
param |
A vector containing all parameters (optimized and fixed). |
deviance |
The deviance at the maximum likelihood estimates. |
corr |
The correlation matrix. |
var.cov |
The variance covariance matrix. |
convergence, counts, message
|
Components taken from the
list returned by |
threshold, r, ulow, rlow, npp
|
The arguments of the same name. |
nhigh |
The number of exceedences (if |
nat, pat
|
The number and proportion of exceedences. |
extind |
The estimate of the extremal index (i.e.
|
data |
The data passed to the argument |
exceedances |
The exceedences, or the maxima of the clusters of exceedences. |
mper |
The argument |
scale |
The scale parameter for the fitted generalized Pareto
distribution. If |
call |
The call of the current function. |
The standard errors and the correlation matrix in the returned
object are taken from the observed information, calculated by a
numerical approximation.
They must be interpreted with caution when the shape parameter
is less than , because the usual asymptotic
properties of maximum likelihood estimators do not then
hold (Smith, 1985).
Smith, R. L. (1985) Maximum likelihood estimation in a class of non-regular cases. Biometrika, 72, 67–90.
anova.evd, optim,
plot.uvevd, profile.evd,
profile2d.evd, mrlplot,
tcplot
uvdata <- rgpd(100, loc = 0, scale = 1.1, shape = 0.2) M1 <- fpot(uvdata, 1) M2 <- fpot(uvdata, 1, shape = 0) anova(M1, M2) par(mfrow = c(2,2)) plot(M1) ## Not run: M1P <- profile(M1) ## Not run: plot(M1P) M1 <- fpot(uvdata, 1, mper = 10) M2 <- fpot(uvdata, 1, mper = 100) ## Not run: M1P <- profile(M1, which = "rlevel", conf=0.975, mesh=0.1) ## Not run: M2P <- profile(M2, which = "rlevel", conf=0.975, mesh=0.1) ## Not run: plot(M1P) ## Not run: plot(M2P)uvdata <- rgpd(100, loc = 0, scale = 1.1, shape = 0.2) M1 <- fpot(uvdata, 1) M2 <- fpot(uvdata, 1, shape = 0) anova(M1, M2) par(mfrow = c(2,2)) plot(M1) ## Not run: M1P <- profile(M1) ## Not run: plot(M1P) M1 <- fpot(uvdata, 1, mper = 10) M2 <- fpot(uvdata, 1, mper = 100) ## Not run: M1P <- profile(M1, which = "rlevel", conf=0.975, mesh=0.1) ## Not run: M2P <- profile(M2, which = "rlevel", conf=0.975, mesh=0.1) ## Not run: plot(M1P) ## Not run: plot(M2P)
Density function, distribution function, quantile function and random generation for the Frechet distribution with location, scale and shape parameters.
dfrechet(x, loc=0, scale=1, shape=1, log = FALSE) pfrechet(q, loc=0, scale=1, shape=1, lower.tail = TRUE) qfrechet(p, loc=0, scale=1, shape=1, lower.tail = TRUE) rfrechet(n, loc=0, scale=1, shape=1)dfrechet(x, loc=0, scale=1, shape=1, log = FALSE) pfrechet(q, loc=0, scale=1, shape=1, lower.tail = TRUE) qfrechet(p, loc=0, scale=1, shape=1, lower.tail = TRUE) rfrechet(n, loc=0, scale=1, shape=1)
x, q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
Number of observations. |
loc, scale, shape
|
Location, scale and shape parameters (can be given as vectors). |
log |
Logical; if |
lower.tail |
Logical; if |
The Frechet distribution function with parameters
, and
is
for and zero otherwise, where and
.
dfrechet gives the density function, pfrechet gives
the distribution function, qfrechet gives the quantile
function, and rfrechet generates random deviates.
dfrechet(2:4, 1, 0.5, 0.8) pfrechet(2:4, 1, 0.5, 0.8) qfrechet(seq(0.9, 0.6, -0.1), 2, 0.5, 0.8) rfrechet(6, 1, 0.5, 0.8) p <- (1:9)/10 pfrechet(qfrechet(p, 1, 2, 0.8), 1, 2, 0.8) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9dfrechet(2:4, 1, 0.5, 0.8) pfrechet(2:4, 1, 0.5, 0.8) qfrechet(seq(0.9, 0.6, -0.1), 2, 0.5, 0.8) rfrechet(6, 1, 0.5, 0.8) p <- (1:9)/10 pfrechet(qfrechet(p, 1, 2, 0.8), 1, 2, 0.8) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Density function, distribution function, quantile function and random generation for the generalized extreme value (GEV) distribution with location, scale and shape parameters.
dgev(x, loc=0, scale=1, shape=0, log = FALSE) pgev(q, loc=0, scale=1, shape=0, lower.tail = TRUE) qgev(p, loc=0, scale=1, shape=0, lower.tail = TRUE) rgev(n, loc=0, scale=1, shape=0)dgev(x, loc=0, scale=1, shape=0, log = FALSE) pgev(q, loc=0, scale=1, shape=0, lower.tail = TRUE) qgev(p, loc=0, scale=1, shape=0, lower.tail = TRUE) rgev(n, loc=0, scale=1, shape=0)
x, q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
Number of observations. |
loc, scale, shape
|
Location, scale and shape parameters; the
|
log |
Logical; if |
lower.tail |
Logical; if |
The GEV distribution function with parameters
, and
is
for , where .
If the distribution is defined by continuity.
If , the value is
either greater than the upper end point (if ), or less
than the lower end point (if ).
The parametric form of the GEV encompasses that of the Gumbel,
Frechet and reverse Weibull distributions, which are obtained
for , and respectively.
It was first introduced by Jenkinson (1955).
dgev gives the density function, pgev gives the
distribution function, qgev gives the quantile function,
and rgev generates random deviates.
Jenkinson, A. F. (1955) The frequency distribution of the annual maximum (or minimum) of meteorological elements. Quart. J. R. Met. Soc., 81, 158–171.
fgev, rfrechet,
rgumbel, rrweibull
dgev(2:4, 1, 0.5, 0.8) pgev(2:4, 1, 0.5, 0.8) qgev(seq(0.9, 0.6, -0.1), 2, 0.5, 0.8) rgev(6, 1, 0.5, 0.8) p <- (1:9)/10 pgev(qgev(p, 1, 2, 0.8), 1, 2, 0.8) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9dgev(2:4, 1, 0.5, 0.8) pgev(2:4, 1, 0.5, 0.8) qgev(seq(0.9, 0.6, -0.1), 2, 0.5, 0.8) rgev(6, 1, 0.5, 0.8) p <- (1:9)/10 pgev(qgev(p, 1, 2, 0.8), 1, 2, 0.8) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Density function, distribution function, quantile function and random generation for the generalized Pareto distribution (GPD) with location, scale and shape parameters.
dgpd(x, loc=0, scale=1, shape=0, log = FALSE) pgpd(q, loc=0, scale=1, shape=0, lower.tail = TRUE) qgpd(p, loc=0, scale=1, shape=0, lower.tail = TRUE) rgpd(n, loc=0, scale=1, shape=0)dgpd(x, loc=0, scale=1, shape=0, log = FALSE) pgpd(q, loc=0, scale=1, shape=0, lower.tail = TRUE) qgpd(p, loc=0, scale=1, shape=0, lower.tail = TRUE) rgpd(n, loc=0, scale=1, shape=0)
x, q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
Number of observations. |
loc, scale, shape
|
Location, scale and shape parameters; the
|
log |
Logical; if |
lower.tail |
Logical; if |
The generalized Pareto distribution function (Pickands, 1975) with
parameters , and
is
for and , where .
If the distribution is defined by continuity.
dgpd gives the density function, pgpd gives the
distribution function, qgpd gives the quantile function,
and rgpd generates random deviates.
Pickands, J. (1975) Statistical inference using extreme order statistics. Annals of Statistics, 3, 119–131.
dgpd(2:4, 1, 0.5, 0.8) pgpd(2:4, 1, 0.5, 0.8) qgpd(seq(0.9, 0.6, -0.1), 2, 0.5, 0.8) rgpd(6, 1, 0.5, 0.8) p <- (1:9)/10 pgpd(qgpd(p, 1, 2, 0.8), 1, 2, 0.8) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9dgpd(2:4, 1, 0.5, 0.8) pgpd(2:4, 1, 0.5, 0.8) qgpd(seq(0.9, 0.6, -0.1), 2, 0.5, 0.8) rgpd(6, 1, 0.5, 0.8) p <- (1:9)/10 pgpd(qgpd(p, 1, 2, 0.8), 1, 2, 0.8) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Density function, distribution function, quantile function and random generation for the Gumbel distribution with location and scale parameters.
dgumbel(x, loc=0, scale=1, log = FALSE) pgumbel(q, loc=0, scale=1, lower.tail = TRUE) qgumbel(p, loc=0, scale=1, lower.tail = TRUE) rgumbel(n, loc=0, scale=1)dgumbel(x, loc=0, scale=1, log = FALSE) pgumbel(q, loc=0, scale=1, lower.tail = TRUE) qgumbel(p, loc=0, scale=1, lower.tail = TRUE) rgumbel(n, loc=0, scale=1)
x, q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
Number of observations. |
loc, scale
|
Location and scale parameters (can be given as vectors). |
log |
Logical; if |
lower.tail |
Logical; if |
The Gumbel distribution function with parameters
and is
for all real , where .
dgumbel gives the density function, pgumbel gives
the distribution function, qgumbel gives the quantile
function, and rgumbel generates random deviates.
dgumbel(-1:2, -1, 0.5) pgumbel(-1:2, -1, 0.5) qgumbel(seq(0.9, 0.6, -0.1), 2, 0.5) rgumbel(6, -1, 0.5) p <- (1:9)/10 pgumbel(qgumbel(p, -1, 2), -1, 2) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9dgumbel(-1:2, -1, 0.5) pgumbel(-1:2, -1, 0.5) qgumbel(seq(0.9, 0.6, -0.1), 2, 0.5) rgumbel(6, -1, 0.5) p <- (1:9)/10 pgumbel(qgumbel(p, -1, 2), -1, 2) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Density function, distribution function, quantile function and random generation for the maxima of two Gumbel distributions, each with different location and scale parameters.
dgumbelx(x, loc1=0, scale1=1, loc2=0, scale2=1, log = FALSE) pgumbelx(q, loc1=0, scale1=1, loc2=0, scale2=1, lower.tail = TRUE) qgumbelx(p, interval, loc1=0, scale1=1, loc2=0, scale2=1, lower.tail = TRUE, ...) rgumbelx(n, loc1=0, scale1=1, loc2=0, scale2=1)dgumbelx(x, loc1=0, scale1=1, loc2=0, scale2=1, log = FALSE) pgumbelx(q, loc1=0, scale1=1, loc2=0, scale2=1, lower.tail = TRUE) qgumbelx(p, interval, loc1=0, scale1=1, loc2=0, scale2=1, lower.tail = TRUE, ...) rgumbelx(n, loc1=0, scale1=1, loc2=0, scale2=1)
x, q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
Number of observations. |
interval |
A length two vector containing the end-points of the interval to be searched for the quantiles, passed to the uniroot function. |
loc1, scale1, loc2, scale2
|
Location and scale parameters of the two Gumbel distributions. The second location parameter must be greater than or equal to the first location parameter. |
log |
Logical; if |
lower.tail |
Logical; if |
... |
Other arguments passed to uniroot. |
dgumbelx gives the density function, pgumbelx gives the
distribution function, qgumbelx gives the quantile function,
and rgumbelx generates random deviates.
fgev, rfrechet,
rgumbel, rrweibull, uniroot
dgumbelx(2:4, 0, 1.1, 1, 0.5) pgumbelx(2:4, 0, 1.1, 1, 0.5) qgumbelx(seq(0.9, 0.6, -0.1), interval = c(0,10), 0, 1.2, 2, 0.5) rgumbelx(6, 0, 1.1, 1, 0.5) p <- (1:9)/10 pgumbelx(qgumbelx(p, interval = c(0,10), 0, 0.5, 1, 2), 0, 0.5, 1, 2) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9dgumbelx(2:4, 0, 1.1, 1, 0.5) pgumbelx(2:4, 0, 1.1, 1, 0.5) qgumbelx(seq(0.9, 0.6, -0.1), interval = c(0,10), 0, 1.2, 2, 0.5) rgumbelx(6, 0, 1.1, 1, 0.5) p <- (1:9)/10 pgumbelx(qgumbelx(p, interval = c(0,10), 0, 0.5, 1, 2), 0, 0.5, 1, 2) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Calculate or plot the density of the spectral measure
on the interval , for nine parametric
bivariate extreme value models.
hbvevd(x = 0.5, dep, asy = c(1,1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), half = FALSE, plot = FALSE, add = FALSE, lty = 1, ...)hbvevd(x = 0.5, dep, asy = c(1,1), alpha, beta, model = c("log", "alog", "hr", "neglog", "aneglog", "bilog", "negbilog", "ct", "amix"), half = FALSE, plot = FALSE, add = FALSE, lty = 1, ...)
x |
A vector of values at which the function is evaluated
(ignored if plot or add is |
dep |
Dependence parameter for the logistic, asymmetric logistic, Husler-Reiss, negative logistic and asymmetric negative logistic models. |
asy |
A vector of length two, containing the two asymmetry parameters for the asymmetric logistic and asymmetric negative logistic models. |
alpha, beta
|
Alpha and beta parameters for the bilogistic, negative bilogistic, Coles-Tawn and asymmetric mixed models. |
model |
The specified model; a character string. Must be
either |
half |
Logical; if |
plot |
Logical; if |
add |
Logical; add to an existing plot? |
lty |
Line type. |
... |
Other high-level graphics parameters to be passed to
|
Any bivariate extreme value distribution can be written as
for some function defined on ,
satisfying
In particular, the total mass of H is two.
The functions and are as defined in
abvevd.
H is called the spectral measure, with density on
the interval .
hbvevd calculates or plots the spectral density function
for one of nine parametric bivariate extreme value models,
at specified parameter values.
For differentiable models H may have up to two point masses:
at zero and one. Assuming that the model parameters are in the
interior of the parameter space, we have the following. For the
asymmetric logistic and asymmetric negative logistic models the
point masses are of size 1-asy1 and 1-asy2
respectively. For the asymmetric mixed model they are of size
1-alpha-beta and 1-alpha-2*beta respectively. For
all other models the point masses are zero.
At independence, H has point masses of size one at both
zero and one. At complete dependence [a non-differentiable
model] H has a single point mass of size two at .
In either case, is zero everywhere.
abvevd, fbvevd,
rbvevd, plot.bvevd
hbvevd(dep = 2.7, model = "hr") hbvevd(seq(0.25,0.5,0.75), dep = 0.3, asy = c(.7,.9), model = "alog") hbvevd(alpha = 0.3, beta = 1.2, model = "negbi", plot = TRUE) bvdata <- rbvevd(100, dep = 0.7, model = "log") M1 <- fitted(fbvevd(bvdata, model = "log")) hbvevd(dep = M1["dep"], model = "log", plot = TRUE)hbvevd(dep = 2.7, model = "hr") hbvevd(seq(0.25,0.5,0.75), dep = 0.3, asy = c(.7,.9), model = "alog") hbvevd(alpha = 0.3, beta = 1.2, model = "negbi", plot = TRUE) bvdata <- rbvevd(100, dep = 0.7, model = "log") M1 <- fitted(fbvevd(bvdata, model = "log")) hbvevd(dep = M1["dep"], model = "log", plot = TRUE)
A numeric vector containing annual maximum wind speeds, in kilometers per hour, from 1941 to 1970 at Lisbon, Portugal.
lisbonlisbon
A vector containing 30 observations.
Tiago de Oliveira, J. (1997) Statistical Analysis of Extremes. Pendor.
The lossalae data frame has 1500 rows and 2 columns.
The columns contain the indemnity payment (loss), and
the allocated loss adjustment expense (alae), both in USD.
The latter is the additional expenses associated with the
settlement of the claim (e.g. claims investigation expenses
and legal fees).
The dataset also has an attribute called capped, which
gives the row names of the indemnity payments that were capped
at their policy limit.
lossalaelossalae
This data frame contains the following columns:
A numeric vector containing the indemnity payments.
A numeric vector containing the allocated loss adjustment expenses.
Frees, E. W. and Valdez, E. A. (1998) Understanding relationships using copulas. North American Actuarial Journal, 2, 1–15.
Klugman, S. A. and Parsa, R. (1999) Fitting bivariate loss distributions with copulas. Insurance: Mathematics and Economics, 24, 139–148.
Beirlant, J., Goegebeur, Y., Segers, J. and Teugels, J. L. (2004) Statistics of Extremes: Theory and Applications., Chichester, England: John Wiley and Sons.
Simulation of MARMA(p,q) processes.
marma(n, p = 0, q = 0, psi, theta, init = rep(0, p), n.start = p, rand.gen = rfrechet, ...) mar(n, p = 1, psi, init = rep(0, p), n.start = p, rand.gen = rfrechet, ...) mma(n, q = 1, theta, rand.gen = rfrechet, ...)marma(n, p = 0, q = 0, psi, theta, init = rep(0, p), n.start = p, rand.gen = rfrechet, ...) mar(n, p = 1, psi, init = rep(0, p), n.start = p, rand.gen = rfrechet, ...) mma(n, q = 1, theta, rand.gen = rfrechet, ...)
n |
The number of observations. |
p |
The AR order of the MARMA process. |
q |
The MA order of the MARMA process. |
psi |
A vector of non-negative parameters, of length
|
theta |
A vector of non-negative parameters, of length
|
init |
A vector of non-negative starting values, of
length |
n.start |
A non-negative value denoting the length of the
burn-in period. If |
rand.gen |
A simulation function to generate the innovations. |
... |
Additional arguments for |
A max autoregressive moving average process ,
denoted by MARMA(p,q), is defined in Davis and Resnick (1989) as
satisfying
where
and
are non-negative vectors of parameters, and where
is a series of iid
random variables with a common distribution defined by
rand.gen.
The functions mar and mma generate MAR(p) and
MMA(q) processes respectively.
A MAR(p) process is equivalent to a
MARMA(p, 0) process, so that
A MMA(q) process is equivalent to a
MARMA(0, q) process, so that
A numeric vector of length n.
Davis, R. A. and Resnick, S. I. (1989) Basic properties and prediction of max-arma processes. Adv. Appl. Prob., 21, 781–803.
marma(100, p = 1, q = 1, psi = 0.75, theta = 0.65) mar(100, psi = 0.85, n.start = 20) mma(100, q = 2, theta = c(0.75, 0.8))marma(100, p = 1, q = 1, psi = 0.75, theta = 0.65) mar(100, psi = 0.85, n.start = 20) mma(100, q = 2, theta = c(0.75, 0.8))
The empirical mean residual life plot.
mrlplot(data, tlim, pscale = FALSE, nt = max(100, length(data)), lty = c(2,1,2), col = 1, conf = 0.95, main = "Mean Residual Life Plot", xlab = "Threshold", ylab = "Mean Excess", ...)mrlplot(data, tlim, pscale = FALSE, nt = max(100, length(data)), lty = c(2,1,2), col = 1, conf = 0.95, main = "Mean Residual Life Plot", xlab = "Threshold", ylab = "Mean Excess", ...)
data |
A numeric vector. |
tlim |
A numeric vector of length two, giving the limits for
the thresholds at which the mean residual life plot is
evaluated. If |
pscale |
If |
nt |
The number of thresholds at which the mean residual life plot is evaluated. |
lty, col
|
Arguments passed to |
conf |
The (pointwise) confidence coefficient for the plotted confidence intervals. |
main |
Plot title. |
xlab, ylab
|
x and y axis labels. |
... |
Other arguments to be passed to |
The empirical mean residual life plot is the locus of points
where are
the observations that exceed the threshold .
If the exceedances of a threshold
are generalized Pareto, the empirical mean residual life plot
should be approximately linear for .
The confidence intervals within the plot are symmetric intervals based on the approximate normality of sample means.
A list with components x and y is invisibly returned.
The components contain those objects that were passed to the formal
arguments x and y of matplot in order to create
the mean residual life plot.
Stuart Coles and Alec Stephenson
mrlplot(portpirie)mrlplot(portpirie)
Transforms to exponential margins under the GEV model.
mtransform(x, p, inv = FALSE, drp = FALSE)mtransform(x, p, inv = FALSE, drp = FALSE)
x |
A matrix with n rows and d columns, or a vector. In
the latter case, if |
p |
A vector of length three or a matrix with n rows and three columns. It can also be a list of length d, in which case each element must be a vector of length three or a matrix with n rows and three columns. |
inv |
Logical; use the inverse transformation? |
drp |
Logical; return a vector rather than a single row matrix?. Note that a single column matrix is always returned as a vector. |
Let denote a vector of observations for
.
This function implements the transformation
to each column of the matrix x.
The values are contained in the ith
row of the n by 3 matrix p. If p is a vector
of length three, the parameters are the same for every
. Alternatively, p can be a list
with d elements, in which case the jth element is used to
transform the jth column of x.
This function is mainly for internal use. It is used by bivariate and multivariate routines to calculate marginal transformations.
A numeric matrix or vector.
Density function, distribution function and random generation for the multivariate logistic and multivariate asymmetric logistic models.
pmvevd(q, dep, asy, model = c("log", "alog"), d = 2, mar = c(0,1,0), lower.tail = TRUE) rmvevd(n, dep, asy, model = c("log", "alog"), d = 2, mar = c(0,1,0)) dmvevd(x, dep, asy, model = c("log", "alog"), d = 2, mar = c(0,1,0), log = FALSE)pmvevd(q, dep, asy, model = c("log", "alog"), d = 2, mar = c(0,1,0), lower.tail = TRUE) rmvevd(n, dep, asy, model = c("log", "alog"), d = 2, mar = c(0,1,0)) dmvevd(x, dep, asy, model = c("log", "alog"), d = 2, mar = c(0,1,0), log = FALSE)
x, q
|
A vector of length |
n |
Number of observations. |
dep |
The dependence parameter(s). For the logistic model,
should be a single value. For the asymmetric logistic model,
should be a vector of length |
asy |
The asymmetry parameters for the asymmetric logistic
model. Should be a list with |
model |
The specified model; a character string. Must be either
|
d |
The dimension. |
mar |
A vector of length three containing marginal parameters
for every univariate margin, or a matrix with three columns where
each column represents a vector of values to be passed to the
corresponding marginal parameter. It can also be a list with
|
log |
Logical; if |
lower.tail |
Logical; if |
Define
for and
, where the marginal
parameters are given by
, .
If then is defined by
continuity.
Let .
In each of the multivariate distributions functions
given below, the
univariate margins are generalized extreme value, so that
for
.
If for some
, the value is
either greater than the upper end point (if ),
or less than the lower end point (if ), of the
th univariate marginal distribution.
model = "log" (Gumbel, 1960)
The d dimensional multivariate logistic distribution
function with parameter is
where .
This is a special case of the multivariate asymmetric logistic
model.
model = "alog" (Tawn, 1990)
Let be the set of all non-empty subsets of
, let
, where
denotes the number of elements in the set , and let
.
The d dimensional multivariate asymmetric logistic distribution
function is
where the dependence parameters for
all , and the asymmetry parameters
for all
and .
The constraints
for
ensure that the marginal distributions are generalized extreme value.
Further constraints arise from the possible redundancy of asymmetry
parameters in the expansion of the distribution form.
Let .
If for some
then
for all .
Furthermore, if for some
,
for all
, then
.
dep should be a vector of length which contains
, with
the order defined by the natural set ordering on the index.
For example, for the trivariate model,
.
asy should be a list with elements.
Each element is a vector which corresponds to a set
, containing for
every integer .
The elements should be given using the natural set ordering on the
, so that the first elements are vectors
of length one corresponding to the sets
, and the last element is a
a vector of length , corresponding to the set
.
asy must be constructed to ensure that all constraints are
satisfied or an error will occur.
pmvevd gives the distribution function, dmvevd gives
the density function and rmvevd generates random deviates, for
the multivariate logistic or multivariate asymmetric logistic model.
Multivariate extensions of other bivariate models are more complex. A multivariate extension of the Husler-Reiss model exists, involving a multidimensional integral and one parameter for each bivariate margin. Multivariate extensions for the negative logistic model can be derived but are considerably more complex and appear to be less flexible. The “multivariate negative logistic model” often presented in the literature (e.g. Kotz et al, 2000) is not a valid distribution function and should not be used.
The logistic and asymmetric logistic models respectively are simulated using Algorithms 2.1 and 2.2 in Stephenson(2003b).
The density function of the logistic model is evaluated using the representation of Shi(1995). The density function of the asymmetric logistic model is evaluated using the representation given in Stephenson(2003a).
Gumbel, E. J. (1960) Distributions des valeurs extremes en plusieurs dimensions. Publ. Inst. Statist. Univ. Paris, 9, 171–173.
Kotz, S. and Balakrishnan, N. and Johnson, N. L. (2000) Continuous Multivariate Distributions, vol. 1. New York: John Wiley & Sons, 2nd edn.
Shi, D. (1995) Fisher information for a multivariate extreme value distribution. Biometrika, 82(3), 644–649.
Stephenson, A. G. (2003a) Extreme Value Distributions and their Application. Ph.D. Thesis, Lancaster University, Lancaster, UK.
Stephenson, A. G. (2003b) Simulating multivariate extreme value distributions of logistic type. Extremes, 6(1), 49–60.
Tawn, J. A. (1990) Modelling multivariate extreme value distributions. Biometrika, 77, 245–253.
pmvevd(matrix(rep(0:4,5), ncol=5), dep = .7, model = "log", d = 5) pmvevd(rep(4,5), dep = .7, model = "log", d = 5) rmvevd(10, dep = .7, model = "log", d = 5) dmvevd(rep(-1,20), dep = .7, model = "log", d = 20, log = TRUE) asy <- list(.4, .1, .6, c(.3,.2), c(.1,.1), c(.4,.1), c(.2,.3,.2)) pmvevd(rep(2,3), dep = c(.6,.5,.8,.3), asy = asy, model = "alog", d = 3) asy <- list(.4, .0, .6, c(.3,.2), c(.1,.1), c(.4,.1), c(.2,.4,.2)) rmvevd(10, dep = c(.6,.5,.8,.3), asy = asy, model = "alog", d = 3) dmvevd(rep(0,3), dep = c(.6,.5,.8,.3), asy = asy, model = "alog", d = 3) asy <- list(0, 0, 0, 0, c(0,0), c(0,0), c(0,0), c(0,0), c(0,0), c(0,0), c(.2,.1,.2), c(.1,.1,.2), c(.3,.4,.1), c(.2,.2,.2), c(.4,.6,.2,.5)) rmvevd(10, dep = .7, asy = asy, model = "alog", d = 4) rmvevd(10, dep = c(rep(1,6), rep(.7,5)), asy = asy, model = "alog", d = 4)pmvevd(matrix(rep(0:4,5), ncol=5), dep = .7, model = "log", d = 5) pmvevd(rep(4,5), dep = .7, model = "log", d = 5) rmvevd(10, dep = .7, model = "log", d = 5) dmvevd(rep(-1,20), dep = .7, model = "log", d = 20, log = TRUE) asy <- list(.4, .1, .6, c(.3,.2), c(.1,.1), c(.4,.1), c(.2,.3,.2)) pmvevd(rep(2,3), dep = c(.6,.5,.8,.3), asy = asy, model = "alog", d = 3) asy <- list(.4, .0, .6, c(.3,.2), c(.1,.1), c(.4,.1), c(.2,.4,.2)) rmvevd(10, dep = c(.6,.5,.8,.3), asy = asy, model = "alog", d = 3) dmvevd(rep(0,3), dep = c(.6,.5,.8,.3), asy = asy, model = "alog", d = 3) asy <- list(0, 0, 0, 0, c(0,0), c(0,0), c(0,0), c(0,0), c(0,0), c(0,0), c(.2,.1,.2), c(.1,.1,.2), c(.3,.4,.1), c(.2,.2,.2), c(.4,.6,.2,.5)) rmvevd(10, dep = .7, asy = asy, model = "alog", d = 4) rmvevd(10, dep = c(rep(1,6), rep(.7,5)), asy = asy, model = "alog", d = 4)
The ocmulgee data frame has 40 rows and 2 columns.
The columns contain maximum annual flood discharges, in units
of 1000 cubed feet per second, from the Ocmulgee River in
Georgia, USA at Hawkinsville (upstream) and Macon (downstream),
for the years 1910 to 1949.
The row names give the years of observation.
ocmulgeeocmulgee
This data frame contains the following columns:
A numeric vector containing maximum annual flood discharges at Hawkinsville (upstream).
A numeric vector containing maximum annual flood discharges at Macon (downstream).
Gumbel, E. J. and Goldstein, N. (1964) Analysis of empirical bivariate extremal distributions. J. Amer. Statist. Assoc., 59, 794–816.
The oldage data frame has 66 rows and 2 columns.
The columns contain the oldest ages at death for men and women in
Sweden, for the period 1905–1970.
The row names give the years of observation.
oldageoldage
This data frame contains the following columns:
A numeric vector containing the oldest ages at death for men.
A numeric vector containing the oldest ages at death for women.
Fransen, A. and Tiago de Oliveira, J. (1984) Statistical choice of univariate extreme models, part II, in Statistical Extremes and Applications, J. Tiago de Oliveira ed., 373–394, D. Reidel, Dordrect.
Density function, distribution function and random generation for a selected order statistic of a given number of independent variables from a specified distribution.
dorder(x, densfun, distnfun, ..., distn, mlen = 1, j = 1, largest = TRUE, log = FALSE) porder(q, distnfun, ..., distn, mlen = 1, j = 1, largest = TRUE, lower.tail = TRUE) rorder(n, quantfun, ..., distn, mlen = 1, j = 1, largest = TRUE)dorder(x, densfun, distnfun, ..., distn, mlen = 1, j = 1, largest = TRUE, log = FALSE) porder(q, distnfun, ..., distn, mlen = 1, j = 1, largest = TRUE, lower.tail = TRUE) rorder(n, quantfun, ..., distn, mlen = 1, j = 1, largest = TRUE)
x, q
|
Vector of quantiles. |
n |
Number of observations. |
densfun, distnfun, quantfun
|
Density, distribution and
quantile function of the specified distribution. The density
function must have a |
... |
Parameters of the specified distribution. |
distn |
A character string, optionally specified as an
alternative to |
mlen |
The number of independent variables. |
j |
The order statistic, taken as the |
largest |
Logical; if |
log |
Logical; if |
lower.tail |
Logical; if |
dorder gives the density function, porder gives the
distribution function and qorder gives the quantile function
of a selected order statistic from a sample of size mlen,
from a specified distibution. rorder generates random deviates.
dorder(2:4, dnorm, pnorm, mean = 0.5, sd = 1.2, mlen = 5, j = 2) dorder(2:4, distn = "norm", mean = 0.5, sd = 1.2, mlen = 5, j = 2) dorder(2:4, distn = "exp", mlen = 2, j = 2) porder(2:4, distn = "exp", rate = 1.2, mlen = 2, j = 2) rorder(5, qgamma, shape = 1, mlen = 10, j = 2)dorder(2:4, dnorm, pnorm, mean = 0.5, sd = 1.2, mlen = 5, j = 2) dorder(2:4, distn = "norm", mean = 0.5, sd = 1.2, mlen = 5, j = 2) dorder(2:4, distn = "exp", mlen = 2, j = 2) porder(2:4, distn = "exp", rate = 1.2, mlen = 2, j = 2) rorder(5, qgamma, shape = 1, mlen = 10, j = 2)
A numeric vector containing annual maximum temperatures, in degrees Fahrenheit, from 1901 to 1980 at Oxford, England.
oxfordoxford
A vector containing 80 observations.
Tabony, R. C. (1983) Extreme value analysis in meteorology. The Meteorological Magazine 112, 77–98.
Six plots (selectable by which) are currently provided:
two conditional P-P plots (1,2), conditioning on each margin, a
density plot (3), a dependence function plot (4), a quantile
curves plot (5) and a spectral density plot (6).
Plot diagnostics for the generalized extreme value margins
(selectable by mar and which) are also available.
## S3 method for class 'bvevd' plot(x, mar = 0, which = 1:6, main, ask = nb.fig < length(which) && dev.interactive(), ci = TRUE, cilwd = 1, a = 0, grid = 50, legend = TRUE, nplty = 2, blty = 3, method = "cfg", convex = FALSE, rev = FALSE, p = seq(0.75, 0.95, 0.05), mint = 1, half = FALSE, ...)## S3 method for class 'bvevd' plot(x, mar = 0, which = 1:6, main, ask = nb.fig < length(which) && dev.interactive(), ci = TRUE, cilwd = 1, a = 0, grid = 50, legend = TRUE, nplty = 2, blty = 3, method = "cfg", convex = FALSE, rev = FALSE, p = seq(0.75, 0.95, 0.05), mint = 1, half = FALSE, ...)
x |
An object of class |
mar |
If |
which |
A subset of the numbers |
main |
Title of each plot. If given, should be a
character vector with the same length as |
ask |
Logical; if |
ci |
Logical; if |
cilwd |
Line width for confidence interval lines. |
a |
Passed through to |
grid |
Argument for the density plot. The (possibly
transformed) data is plotted with a contour plot of the
bivariate density of the fitted model. The density is evaluated
at |
legend |
If |
method, convex, rev
|
Arguments to the dependence function
plot. The dependence function for the fitted model is plotted and
(optionally) compared to a non-parameteric estimate. See
|
nplty, blty
|
Line types for the dependence function plot.
|
p, mint
|
Arguments to the quantile curves plot. See
|
half |
Argument to the spectral density plot. See
|
... |
Other arguments to be passed through to plotting functions. |
In all plots we assume that the fitted model is stationary. For non-stationary models the data are transformed to stationarity. The plot then corresponds to the distribution obtained when all covariates are zero. In particular, the density and quanitle curves plots will not plot the original data for non-stationary models.
A conditional P-P plot is a P-P plot for the condition
distribution function of a bivariate evd object.
Let be the conditional distribution of
the first margin given the second, under the fitted model.
Let be the data used in the fitted model,
where for .
The plot that (by default) is labelled Conditional Plot Two,
conditioning on the second margin, consists of the points
where are plotting points defined by
ppoints and is the th largest
value from the sample
The margins are reversed for Conditional Plot One, so that
is the conditional distribution of the second
margin given the first.
plot.uvevd, contour,
jitter, abvnonpar,
qcbvnonpar
bvdata <- rbvevd(100, dep = 0.6, model = "log") M1 <- fbvevd(bvdata, model = "log") ## Not run: par(mfrow = c(2,2)) ## Not run: plot(M1, which = 1:5) ## Not run: plot(M1, mar = 1) ## Not run: plot(M1, mar = 2)bvdata <- rbvevd(100, dep = 0.6, model = "log") M1 <- fbvevd(bvdata, model = "log") ## Not run: par(mfrow = c(2,2)) ## Not run: plot(M1, which = 1:5) ## Not run: plot(M1, mar = 1) ## Not run: plot(M1, mar = 2)
Four plots (selectable by which) are currently provided:
a density plot (1), a dependence function plot (2), a quantile
curves plot (3) and a spectral density plot (4).
Plot diagnostics for the generalized Pareto peaks-over-threshold
margins (selectable by mar and which) are also
available.
## S3 method for class 'bvpot' plot(x, mar = 0, which = 1:4, main, ask = nb.fig < length(which) && dev.interactive(), grid = 50, above = FALSE, levels = NULL, tlty = 1, blty = 3, rev = FALSE, p = seq(0.75, 0.95, 0.05), half = FALSE, ...)## S3 method for class 'bvpot' plot(x, mar = 0, which = 1:4, main, ask = nb.fig < length(which) && dev.interactive(), grid = 50, above = FALSE, levels = NULL, tlty = 1, blty = 3, rev = FALSE, p = seq(0.75, 0.95, 0.05), half = FALSE, ...)
x |
An object of class |
mar |
If |
which |
A subset of the numbers |
main |
Title of each plot. If given, should be a
character vector with the same length as |
ask |
Logical; if |
grid, levels
|
Arguments for the density plot. The
data is plotted with a contour plot of the bivariate density
of the fitted model in the tail region. The density is evaluated
at |
above |
Logical; if |
tlty |
Line type for the lines identifying the thresholds. |
rev, blty
|
Arguments to the dependence function
plot. See |
p |
Lower tail probabilities for the quantile curves plot.
The plot is of the same type as given by the function
|
half |
Argument to the spectral density plot. See
|
... |
Other arguments to be passed through to plotting functions. |
plot.bvevd, contour,
abvnonpar, qcbvnonpar, hbvevd
bvdata <- rbvevd(500, dep = 0.6, model = "log") M1 <- fbvpot(bvdata, threshold = c(0,0), model = "log") ## Not run: plot(M1) ## Not run: plot(M1, mar = 1) ## Not run: plot(M1, mar = 2)bvdata <- rbvevd(500, dep = 0.6, model = "log") M1 <- fbvpot(bvdata, threshold = c(0,0), model = "log") ## Not run: plot(M1) ## Not run: plot(M1, mar = 1) ## Not run: plot(M1, mar = 2)
Displays profile log-likelihoods from a model profiled with
profile.evd.
## S3 method for class 'profile.evd' plot(x, which = names(x), main = NULL, ask = nb.fig < length(which) && dev.interactive(), ci = 0.95, clty = 2, ...)## S3 method for class 'profile.evd' plot(x, which = names(x), main = NULL, ask = nb.fig < length(which) && dev.interactive(), ci = 0.95, clty = 2, ...)
x |
An object of class |
which |
A character vector giving the parameters for which
the profile deviance is plotted, and for which profile confidence
intervals are calculated. By default all profiled parameters in
|
main |
Title of each plot; a character vector, the
same length as |
ask |
Logical; if |
ci |
A numeric vector. For each parameter in |
clty |
The line type of the horizontal lines that represent
the profile confidence intervals. To omit the lines set
|
... |
Other graphics parameters. |
Profile devainces are plotted for each parameter in
which. For calculation of profile confidence intervals,
use the confint.profile.evd function.
The profile confidence intervals may not have confidence coefficient
ci, because the usual asymptotic properties of maximum
likelihood estimators may not hold.
For the GEV model, the usual asymptotic properties hold when the
shape parameter is greater than (Smith, 1985).
Smith, R. L. (1985) Maximum likelihood estimation in a class of non-regular cases. Biometrika, 72, 67–90.
confint.profile.evd, plot.profile2d.evd,
profile.evd, profile2d.evd
uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata) ## Not run: M1P <- profile(M1) ## Not run: par(mfrow = c(2,2)) ## Not run: cint <- plot(M1P, ci = c(0.95, 0.99)) ## Not run: cintuvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata) ## Not run: M1P <- profile(M1) ## Not run: par(mfrow = c(2,2)) ## Not run: cint <- plot(M1P, ci = c(0.95, 0.99)) ## Not run: cint
Displays an image plot of the joint profile log-likelihood
from a model profiled with profile.evd and
profile2d.evd.
## S3 method for class 'profile2d.evd' plot(x, main = NULL, ci = c(0.5, 0.8, 0.9, 0.95, 0.975, 0.99, 0.995), col = heat.colors(8), intpts = 75, xaxs = "r", yaxs = "r", ...)## S3 method for class 'profile2d.evd' plot(x, main = NULL, ci = c(0.5, 0.8, 0.9, 0.95, 0.975, 0.99, 0.995), col = heat.colors(8), intpts = 75, xaxs = "r", yaxs = "r", ...)
x |
An object of class |
main |
Title of plot; a character string. |
ci |
A numeric vector whose length is one less than the
length of |
col |
A list of colors such as that generated by
|
intpts |
If the package interp is available,
interpolation is performed using |
xaxs, yaxs
|
Graphics parameters (see |
... |
Other parameters to be passed to |
The sets represented by different colours may not be
confidence sets with confidence coefficients ci, because
the usual asymptotic properties of maximum likelihood estimators
may not hold.
For the GEV model, the usual asymptotic properties hold when the
shape parameter is greater than (Smith, 1985).
Smith, R. L. (1985) Maximum likelihood estimation in a class of non-regular cases. Biometrika, 72, 67–90.
plot.profile.evd, profile.evd,
profile2d.evd
uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata) ## Not run: M1P <- profile(M1) ## Not run: M1JP <- profile2d(M1, M1P, which = c("scale", "shape")) ## Not run: plot(M1JP)uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata) ## Not run: M1P <- profile(M1) ## Not run: M1JP <- profile2d(M1, M1P, which = c("scale", "shape")) ## Not run: plot(M1JP)
Four plots (selectable by which) are currently provided:
a P-P plot, a Q-Q plot, a density plot and a return level plot.
## S3 method for class 'uvevd' plot(x, which = 1:4, main, ask = nb.fig < length(which) && dev.interactive(), ci = TRUE, cilwd = 1, a = 0, adjust = 1, jitter = FALSE, nplty = 2, ...) ## S3 method for class 'gumbelx' plot(x, interval, which = 1:4, main, ask = nb.fig < length(which) && dev.interactive(), ci = TRUE, cilwd = 1, a = 0, adjust = 1, jitter = FALSE, nplty = 2, ...)## S3 method for class 'uvevd' plot(x, which = 1:4, main, ask = nb.fig < length(which) && dev.interactive(), ci = TRUE, cilwd = 1, a = 0, adjust = 1, jitter = FALSE, nplty = 2, ...) ## S3 method for class 'gumbelx' plot(x, interval, which = 1:4, main, ask = nb.fig < length(which) && dev.interactive(), ci = TRUE, cilwd = 1, a = 0, adjust = 1, jitter = FALSE, nplty = 2, ...)
x |
An object that inherits from class |
which |
If a subset of the plots is required, specify a
subset of the numbers |
main |
Title of each plot. If given, must be a character
vector with the same length as |
ask |
Logical; if |
ci |
Logical; if |
cilwd |
Line width for confidence interval lines. |
a |
Passed through to |
adjust, jitter, nplty
|
Arguments to the density plot.
The density of the fitted model is plotted with a rug plot and
(optionally) a non-parameteric estimate. The argument
|
interval |
A vector of length two, for the gumbelx (maximum of two Gumbels) model. This is passed to the uniroot function to calculate quantiles for the Q-Q and return level plots. The interval should be large enough to contain all plotted quantiles or an error from uniroot will occur. |
... |
Other parameters to be passed through to plotting functions. |
The following discussion assumes that the fitted model is stationary. For non-stationary generalized extreme value models the data are transformed to stationarity. The plot then corresponds to the distribution obtained when all covariates are zero.
The P-P plot consists of the points
where is the empirical distribution function
(defined using ppoints), G is the model based
estimate of the distribution (generalized extreme value
or generalized Pareto), and are the data
used in the fitted model, sorted into ascending order.
The Q-Q plot consists of the points
where is the model based estimate of the quantile
function (generalized extreme value or generalized Pareto),
are plotting points defined by
ppoints, and are the data
used in the fitted model, sorted into ascending order.
The return level plot for generalized extreme value models is defined as follows.
Let be the generalized extreme value distribution
function, with location, scale and shape parameters ,
and respectively.
Let be defined by .
In common terminology, is the return level
associated with the return period .
Let .
It follows that
When , is defined by continuity, so that
The curve within the return level plot is plotted
against on a logarithmic scale, using maximum likelihood
estimates of . If the estimate of is zero, the
curve will be linear.
For large values of , is approximately equal
to the return period . It is usual practice to label the
x-axis as the return period.
The points on the plot are
where are plotting points defined by
ppoints, and are the data
used in the fitted model, sorted into ascending order.
For a good fit the points should lie “close” to the curve.
The return level plot for peaks over threshold models is defined as follows.
Let be the generalized Pareto distribution function,
with location, scale and shape parameters ,
and respectively, where is the model threshold.
Let denote the period return level
(see fpot and the notation therein).
It follows that
When , is defined by continuity, so that
The curve within the return level plot is plotted
against on a logarithmic scale, using maximum likelihood
estimates of . If the estimate of is zero,
the curve will be linear.
The points on the plot are
where are plotting points defined by
ppoints, and are the data
used in the fitted model, sorted into ascending order.
For a good fit the points should lie “close” to the curve.
plot.bvevd, density,
jitter, rug, ppoints
uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata) ## Not run: par(mfrow = c(2,2)) ## Not run: plot(M1) uvdata <- rgpd(100, loc = 0, scale = 1.1, shape = 0.2) M1 <- fpot(uvdata, 1) ## Not run: par(mfrow = c(2,2)) ## Not run: plot(M1)uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata) ## Not run: par(mfrow = c(2,2)) ## Not run: plot(M1) uvdata <- rgpd(100, loc = 0, scale = 1.1, shape = 0.2) M1 <- fpot(uvdata, 1) ## Not run: par(mfrow = c(2,2)) ## Not run: plot(M1)
A numeric vector containing annual maximum sea levels, in metres, from 1923 to 1987 at Port Pirie, South Australia.
portpirieportpirie
A vector containing 65 observations.
Tawn, J. A. (1993) Extreme sea-levels, in Statistics in the Environment, 243–263, eds. V. Barnett and F. Turkman, Wiley.
Coles, S. G. (2001) An Introduction to Statistical Modeling of Extreme Values. London: Springer-Verlag.
Calculate profile traces for fitted models.
## S3 method for class 'evd' profile(fitted, which = names(fitted$estimate), conf = 0.999, mesh = fitted$std.err[which]/4, xmin = rep(-Inf, length(which)), xmax = rep(Inf, length(which)), convergence = FALSE, method = "BFGS", control = list(maxit = 500), ...)## S3 method for class 'evd' profile(fitted, which = names(fitted$estimate), conf = 0.999, mesh = fitted$std.err[which]/4, xmin = rep(-Inf, length(which)), xmax = rep(Inf, length(which)), convergence = FALSE, method = "BFGS", control = list(maxit = 500), ...)
fitted |
An object of class |
which |
A character vector giving the model parameters that are to be profiled. By default, all parameters are profiled. |
conf |
Controls the range over which the parameters are profiled.
The profile trace is constructed so that (assuming the usual
asymptotic properties hold) profile confidence intervals with
confidence coefficients |
mesh |
A numeric vector containing one value for each
parameter in |
xmin, xmax
|
Numeric vectors containing one value for each
parameter in |
convergence |
Logical; print convergence code after each
optimization? (A warning is given for each non-zero convergence
code, irrespective of the value of |
method |
The optimization method. |
control |
Passed to |
... |
Ignored. |
An object of class "profile.evd", which is a list with an
element for each parameter being profiled. The elements are
matrices. The first column contains the values of the profiled
parameter. The second column contains profile deviances. The
remaining columns contain the constrained maximum likelihood
estimates for the remaining model parameters. For calculation of
profile confidence intervals, use the confint.profile.evd
function.
confint.profile.evd, profile2d.evd,
plot.profile.evd
uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata) ## Not run: M1P <- profile(M1) ## Not run: par(mfrow = c(2,2)) ## Not run: cint <- plot(M1P) ## Not run: cintuvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata) ## Not run: M1P <- profile(M1) ## Not run: par(mfrow = c(2,2)) ## Not run: cint <- plot(M1P) ## Not run: cint
Calculate joint profile traces for fitted models.
## S3 method for class 'evd' profile2d(fitted, prof, which, pts = 20, convergence = FALSE, method = "Nelder-Mead", control = list(maxit = 5000), ...)## S3 method for class 'evd' profile2d(fitted, prof, which, pts = 20, convergence = FALSE, method = "Nelder-Mead", control = list(maxit = 5000), ...)
fitted |
An object of class |
prof |
An object of class |
which |
A character vector of length two containing the original model parameters that are to be jointly profiled. |
pts |
The number of distinct values used for each profiled
parameter in |
convergence |
Logical; print convergence code after each
optimization? (A warning is given for each non-zero convergence
code, irrespective of the value of |
method |
The optimization method. |
control |
Passed to |
... |
Ignored. |
An object of class "profile2d.evd", which is a list with three
elements.
The first element, a matrix named trace, has the same structure
as the elements of an object of class "profile.evd".
The last two elements give the distinct values used for each profiled
parameter in which.
profile.evd, plot.profile2d.evd
uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata) ## Not run: M1P <- profile(M1) ## Not run: M1JP <- profile2d(M1, M1P, which = c("scale", "shape")) ## Not run: plot(M1JP)uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2) M1 <- fgev(uvdata) ## Not run: M1P <- profile(M1) ## Not run: M1JP <- profile2d(M1, M1P, which = c("scale", "shape")) ## Not run: plot(M1JP)
Calculate or plot non-parametric estimates for quantile curves of bivariate extreme value distributions.
qcbvnonpar(p = seq(0.75, 0.95, 0.05), data, epmar = FALSE, nsloc1 = NULL, nsloc2 = NULL, mint = 1, method = c("cfg", "pickands", "tdo"), convex = FALSE, madj = 0, kmar = NULL, plot = FALSE, add = FALSE, lty = 1, lwd = 1, col = 1, xlim = range(data[,1], na.rm = TRUE), ylim = range(data[,2], na.rm = TRUE), xlab = colnames(data)[1], ylab = colnames(data)[2], ...)qcbvnonpar(p = seq(0.75, 0.95, 0.05), data, epmar = FALSE, nsloc1 = NULL, nsloc2 = NULL, mint = 1, method = c("cfg", "pickands", "tdo"), convex = FALSE, madj = 0, kmar = NULL, plot = FALSE, add = FALSE, lty = 1, lwd = 1, col = 1, xlim = range(data[,1], na.rm = TRUE), ylim = range(data[,2], na.rm = TRUE), xlab = colnames(data)[1], ylab = colnames(data)[2], ...)
p |
A vector of lower tail probabilities. One quantile curve is calculated or plotted for each probability. |
data |
A matrix or data frame with two columns, which may contain missing values. |
epmar |
If |
nsloc1, nsloc2
|
A data frame with the same number of rows as
|
mint |
An integer |
method, kmar
|
Arguments for the non-parametric estimate of the
dependence function. See |
convex, madj
|
Other arguments for the non-parametric estimate of the dependence function. |
plot |
Logical; if |
add |
Logical; add quantile curves to an existing data plot?
The existing plot should have been created using either
|
lty, lwd
|
Line types and widths. |
col |
Line colour. |
xlim, ylim
|
x and y-axis limits. |
xlab, ylab
|
x and y-axis labels. |
... |
Other high-level graphics parameters to be passed to
|
Let G be a fitted bivariate distribution function with
margins and . A quantile curve for a fitted
distribution function G at lower tail probability p is defined
by
For bivariate extreme value distributions, it consists of the points
where and ,
with being the estimated dependence function defined
in abvevd, and where lies in the interval
.
By default the margins and are modelled using
estimated generalized extreme value distributions.
For non-stationary generalized extreme value margins the plotted
data are transformed to stationarity, and the plot corresponds
to the distribution obtained when all covariates are zero.
If epmar is TRUE, empirical transformations
are used in preference to generalized extreme value models.
Note that the marginal empirical quantile functions are
evaluated using quantile, which linearly
interpolates between data points, hence the curve will not
be a step function.
The idea behind the argument is that if
G is fitted to a dataset of componentwise maxima, and the
underlying observations are iid distributed according
to F, then if is the size of the blocks over which the
maxima were taken, approximately , leading
to .
qcbvnonpar calculates or plots non-parametric quantile
curve estimates for bivariate extreme value distributions.
If p has length one it returns a two column matrix
giving points on the curve, else it returns a list of
such matrices.
bvdata <- rbvevd(100, dep = 0.7, model = "log") qcbvnonpar(c(0.9,0.95), data = bvdata, plot = TRUE) qcbvnonpar(c(0.9,0.95), data = bvdata, epmar = TRUE, plot = TRUE)bvdata <- rbvevd(100, dep = 0.7, model = "log") qcbvnonpar(c(0.9,0.95), data = bvdata, plot = TRUE) qcbvnonpar(c(0.9,0.95), data = bvdata, epmar = TRUE, plot = TRUE)
Density function, distribution function, quantile function and random generation for the reverse (or negative) Weibull distribution with location, scale and shape parameters.
drweibull(x, loc=0, scale=1, shape=1, log = FALSE) prweibull(q, loc=0, scale=1, shape=1, lower.tail = TRUE) qrweibull(p, loc=0, scale=1, shape=1, lower.tail = TRUE) rrweibull(n, loc=0, scale=1, shape=1) dnweibull(x, loc=0, scale=1, shape=1, log = FALSE) pnweibull(q, loc=0, scale=1, shape=1, lower.tail = TRUE) qnweibull(p, loc=0, scale=1, shape=1, lower.tail = TRUE) rnweibull(n, loc=0, scale=1, shape=1)drweibull(x, loc=0, scale=1, shape=1, log = FALSE) prweibull(q, loc=0, scale=1, shape=1, lower.tail = TRUE) qrweibull(p, loc=0, scale=1, shape=1, lower.tail = TRUE) rrweibull(n, loc=0, scale=1, shape=1) dnweibull(x, loc=0, scale=1, shape=1, log = FALSE) pnweibull(q, loc=0, scale=1, shape=1, lower.tail = TRUE) qnweibull(p, loc=0, scale=1, shape=1, lower.tail = TRUE) rnweibull(n, loc=0, scale=1, shape=1)
x, q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
Number of observations. |
loc, scale, shape
|
Location, scale and shape parameters (can be given as vectors). |
log |
Logical; if |
lower.tail |
Logical; if |
The reverse (or negative) Weibull distribution function with parameters
, and
is
for and one otherwise, where and
.
drweibull and dnweibull give the density function,
prweibull and pnweibull give the distribution function,
qrweibull and qnweibull give the quantile function,
rrweibull and rnweibull generate random deviates.
Within extreme value theory the reverse Weibull distibution (also known as the negative Weibull distribution) is often referred to as the Weibull distribution. We make a distinction to avoid confusion with the three-parameter distribution used in survival analysis, which is related by a change of sign to the distribution given above.
drweibull(-5:-3, -1, 0.5, 0.8) prweibull(-5:-3, -1, 0.5, 0.8) qrweibull(seq(0.9, 0.6, -0.1), 2, 0.5, 0.8) rrweibull(6, -1, 0.5, 0.8) p <- (1:9)/10 prweibull(qrweibull(p, -1, 2, 0.8), -1, 2, 0.8) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9drweibull(-5:-3, -1, 0.5, 0.8) prweibull(-5:-3, -1, 0.5, 0.8) qrweibull(seq(0.9, 0.6, -0.1), 2, 0.5, 0.8) rrweibull(6, -1, 0.5, 0.8) p <- (1:9)/10 prweibull(qrweibull(p, -1, 2, 0.8), -1, 2, 0.8) ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
A numeric vector containing maximum annual flood discharges, in units of 1000 cubic feet per second, of the North Saskachevan River at Edmonton, over a period of 47 years. Unfortunately, the data are ordered from largest to smallest.
sasksask
A vector containing 47 observations.
van Montfort, M. A. J. (1970) On testing that the distribution is of type I when type II is the alternative. J. Hydrology, 11, 421–427.
The sealevel data frame has 81 rows and 2 columns.
The columns contain annual sea level maxima from 1912 to 1992 at
Dover and Harwich respectively, two sites on the coast of Britain.
The row names give the years of observation.
There are 39 missing values.
sealevelsealevel
This data frame contains the following columns:
A numeric vector containing annual sea level maxima at Dover, including 9 missing values.
A numeric vector containing sea annual level maxima at Harwich, including 30 missing values.
Coles, S. G. and Tawn, J. A. (1990) Statistics of coastal flood prevention. Phil. Trans. R. Soc. Lond., A 332, 457–476.
The sealevel2 data frame has 81 rows and 3 columns.
The first two columns contain annual sea level maxima from 1912
to 1992 at Dover and Harwich respectively, two sites on the coast
of Britain.
The third column is a logical vector denoting whether or not the
maxima in a given year are assumed to have derived from the
same storm event; this assumption is made if the times of
obsevation of the maxima are at most 48 hours apart.
The row names give the years of observation.
There are 39 missing data values.
There are only nine non-missing logical values.
sealevel2sealevel2
This data frame contains the following columns:
A numeric vector containing annual sea level maxima at Dover, including 9 missing values.
A numeric vector containing sea annual level maxima at Harwich, including 30 missing values.
A logical vector denoting whether or not the maxima are assumed to have derived from the same storm event.
Coles, S. G. and Tawn, J. A. (1990) Statistics of coastal flood prevention. Phil. Trans. R. Soc. Lond., A 332, 457–476.
Plots of parameter estimates at various thresholds for peaks over threshold modelling, using the Generalized Pareto or Point Process representation.
tcplot(data, tlim, model = c("gpd","pp"), pscale = FALSE, cmax = FALSE, r = 1, ulow = -Inf, rlow = 1, nt = 25, which = 1:npar, conf = 0.95, lty = 1, lwd = 1, type = "b", cilty = 1, vci = TRUE, xlab, xlim, ylabs, ylims, ask = nb.fig < length(which) && dev.interactive(), ...)tcplot(data, tlim, model = c("gpd","pp"), pscale = FALSE, cmax = FALSE, r = 1, ulow = -Inf, rlow = 1, nt = 25, which = 1:npar, conf = 0.95, lty = 1, lwd = 1, type = "b", cilty = 1, vci = TRUE, xlab, xlim, ylabs, ylims, ask = nb.fig < length(which) && dev.interactive(), ...)
data |
A numeric vector. |
tlim |
A numeric vector of length two, giving the limits for the thresholds at which the model is fitted. |
model |
The model; either |
pscale |
If |
cmax |
Logical; if |
r, ulow, rlow
|
Arguments used for the identification of
clusters of exceedences (see |
nt |
The number of thresholds at which the model is fitted. |
which |
If a subset of the plots is required, specify a
subset of the numbers |
conf |
The (pointwise) confidence coefficient for the plotted confidence intervals. Use zero to suppress. |
lty, lwd
|
The line type and width of the line connecting the parameter estimates. |
type |
The form taken by the line connecting the parameter
estimates and the points denoting these estimates. Possible
values include |
cilty |
The line type of the lines depicting the confidence intervals. |
vci |
If |
xlab, xlim
|
Label and limits for the x-axis; if given, these arguments apply to every plot. |
ylabs, ylims
|
A vector of y-axis labels and a matrix of
y-axis limits. If given, |
ask |
Logical; if |
... |
Other arguments to be passed to the model fit
function |
For each of the nt thresholds a peaks over threshold model
is fitted using the function fpot. When model = "gpd"
(the default), the maximum likelihood estimates for the shape and the
modified scale parameter (modified by subtracting the shape multiplied
by the threshold) are plotted against the thresholds.
When model = "pp" the maximum likelihood estimates for the
location, scale and shape parameters are plotted against the
thresholds. (The modified scale parameter in the "gpd" case
is equivalent to the scale parameter in the "pp" case.)
If the threshold u is a valid threshold to be used for peaks
over threshold modelling, the parameter estimates depicted should
be approximately constant above u.
A list is invisibly returned. Each component is a matrix with three columns giving parameter estimates and confidence limits.
Stuart Coles and Alec Stephenson
tlim <- c(3.6, 4.2) ## Not run: tcplot(portpirie, tlim) ## Not run: tcplot(portpirie, tlim, nt = 100, lwd = 3, type = "l") ## Not run: tcplot(portpirie, tlim, model = "pp")tlim <- c(3.6, 4.2) ## Not run: tcplot(portpirie, tlim) ## Not run: tcplot(portpirie, tlim, nt = 100, lwd = 3, type = "l") ## Not run: tcplot(portpirie, tlim, model = "pp")
The uccle data frame has 35 rows and 4 columns.
The columns contain annual rainfall maxima (in millimetres) from
1938 to 1972 at Uccle, Belgium, over the durations of one day,
one hour, ten minutes and one minute.
The row names give the years of observation.
uccleuccle
This data frame contains the following columns:
Annual daily rainfall maxima.
Annual hourly rainfall maxima.
Annual rainfall maxima over ten minute durations.
Annual rainfall maxima over one minute durations.
Sneyers, R. (1977) L'intensite maximale des precipitations en Belgique. Inst. Royal Meteor. Belgique, B 86.
The venice data frame has 51 rows and 10 columns.
The jth column contains the jth largest sea levels in Venice,
for the years 1931–1981.
Only the largest six measurements are available for the year 1935;
the corresponding row contains four missing values.
The years for each set of measurements are given as row names.
A larger version of this data is available in the dataset
venice2.
venicevenice
A data frame with 51 rows and 10 columns.
Smith, R. L. (1986)
Extreme value theory based on the largest annual events.
Journal of Hydrology, 86, 27–43.
Coles, S. G. (2001) An Introduction to Statistical Modeling of Extreme Values. London: Springer-Verlag.
The venice2 data frame has 125 rows and 10 columns.
The data was kindly provided by Anthony Davison.
The jth column contains the jth largest sea levels in Venice,
for the years 1887–2011.
This is a larger version of the dataset venice.
Only the largest six measurements are available for the year 1935,
and only the largest is available for 1922; the corresponding rows
contain missing values.
The years for each set of measurements are given as row names.
venice2venice2
A data frame with 125 rows and 10 columns.
Smith, R. L. (1986)
Extreme value theory based on the largest annual events.
Journal of Hydrology, 86, 27–43.
Coles, S. G. (2001) An Introduction to Statistical Modeling of Extreme Values. London: Springer-Verlag.