Title: | Smooth Estimation of GPD Shape Parameter |
---|---|
Description: | Given independent and identically distributed observations X(1), ..., X(n) from a Generalized Pareto distribution with shape parameter gamma in [-1,0], offers several methods to compute estimates of gamma. The estimates are based on the principle of replacing the order statistics by quantiles of a distribution function based on a log-concave density function. This procedure is justified by the fact that the GPD density is log-concave for gamma in [-1,0]. |
Authors: | Kaspar Rufibach <[email protected]> and Samuel Mueller <[email protected]> |
Maintainer: | Kaspar Rufibach <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.0.5 |
Built: | 2024-12-31 03:15:37 UTC |
Source: | https://github.com/cran/smoothtail |
Given independent and identically distributed observations $X_1, \ldots, X_n$ from a
Generalized Pareto distribution with shape parameter $\gamma \in [-1,0]$, offers three
methods to compute estimates of $\gamma$. The estimates are based on the principle of replacing the order
statistics $X_{(1)}, \ldots, X_{(n)}$ of the sample by quantiles $\hat X_{(1)}, \ldots, \hat X_{(n)}$
of the distribution function $\hat F_n$ based on the log–concave density estimator $\hat f_n$.
This procedure is justified by the fact that the GPD density is log–concave for $\gamma \in [-1,0]$.
Package: | smoothtail |
Type: | Package |
Version: | 2.0.5 |
Date: | 2016-07-12 |
License: | GPL (>=2) |
Use this package to estimate the shape parameter $\gamma$ of a Generalized Pareto Distribution (GPD). In
extreme value theory, $\gamma$ is called the tail index. We offer three new estimators, all based on the fact
that the density function of the GPD is log–concave if $\gamma \in [-1,0]$, see Mueller and Rufibach (2009).
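The log–concavity claim can be checked with a short calculation. The sketch below assumes the standard GPD density with location 0, written here as $w_{\gamma,\sigma}$ (the symbol is an assumption of this note), on its support and for $\gamma \neq 0$:

```latex
\begin{align*}
w_{\gamma,\sigma}(x) &= \frac{1}{\sigma}\Bigl(1 + \frac{\gamma x}{\sigma}\Bigr)^{-1/\gamma - 1}, \\
\log w_{\gamma,\sigma}(x) &= -\log\sigma - \Bigl(\frac{1}{\gamma} + 1\Bigr)\log\Bigl(1 + \frac{\gamma x}{\sigma}\Bigr), \\
\frac{d^2}{dx^2}\log w_{\gamma,\sigma}(x) &= \frac{\gamma(1+\gamma)}{\sigma^2\,(1 + \gamma x/\sigma)^2}.
\end{align*}
```

The second derivative is non-positive exactly when $\gamma(1+\gamma) \le 0$, that is, when $\gamma \in [-1,0]$. The boundary case $\gamma = 0$ is the exponential density, whose logarithm is linear and hence concave.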
The functions for estimation of the tail index are:
pickands
falk
falkMVUE
generalizedPick
This package depends on the package logcondens for estimation of a log–concave density: all the above functions take as first argument a dlc object as generated by logConDens in logcondens.
Additionally, functions for density, distribution function, quantile function and random number generation for
a GPD with location parameter 0, shape parameter $\gamma$ and scale parameter $\sigma$ are provided:
dgpd
pgpd
qgpd
rgpd
Let us briefly clarify what we mean by log–concave density estimation. Suppose we are given an ordered sample
$Y_1 < \ldots < Y_n$ of i.i.d. random variables having density function $f$, where $f = \exp \varphi$
for a concave function $\varphi$. Following the development in
Duembgen and Rufibach (2009), it is then possible to get an estimator $\hat f_n = \exp \hat \varphi_n$
of $f$ via the maximizer $\hat \varphi_n$ of the normalized log-likelihood functional
$$L(\varphi) = \frac{1}{n} \sum_{i=1}^n \varphi(Y_i) - \int \exp \varphi(t) \, dt$$
over all concave functions $\varphi$. It turns out that $\hat \varphi_n$ is piecewise linear, with
knots only at (some of the) observation points. Therefore, the infinite-dimensional optimization problem of finding
the function $\hat \varphi_n$ boils down to a finite-dimensional problem of finding the vector
$(\hat \varphi_n(Y_1), \ldots, \hat \varphi_n(Y_n))$.
How to solve this problem is
described in Rufibach (2006, 2007) and in a more general setting in Duembgen, Huesler, and Rufibach (2010). The distribution function based on
$\hat f_n$ is defined as
$$\hat F_n(x) = \int_{Y_1}^x \hat f_n(t) \, dt$$
for $x$ a real number. The definition of $\hat F_n$ is justified by the fact that $\hat F_n(Y_n) = 1$.
Kaspar Rufibach (maintainer), [email protected] ,
http://www.kasparrufibach.ch
Samuel Mueller, [email protected],
www.maths.usyd.edu.au/ut/people?who=S_Mueller
Kaspar Rufibach acknowledges support by the Swiss National Science Foundation SNF, http://www.snf.ch
Duembgen, L. and Rufibach, K. (2009) Maximum likelihood estimation of a log–concave density and its distribution function: basic properties and uniform consistency. Bernoulli, 15(1), 40–68.
Duembgen, L., Huesler, A. and Rufibach, K. (2010) Active set and EM algorithms for log-concave densities based on complete and censored data. Technical report 61, IMSV, Univ. of Bern, available at http://arxiv.org/abs/0707.4643.
Mueller, S. and Rufibach K. (2009). Smooth tail index estimation. J. Stat. Comput. Simul., 79, 1155–1167.
Mueller, S. and Rufibach K. (2008). On the max–domain of attraction of distributions with log–concave densities. Statist. Probab. Lett., 78, 1440–1444.
Rufibach, K. (2006) Log-concave Density Estimation and Bump Hunting for i.i.d. Observations. PhD Thesis, University of Bern, Switzerland and Georg-August University of Goettingen, Germany. Available at http://www.zb.unibe.ch/download/eldiss/06rufibach_k.pdf.
Rufibach, K. (2007) Computing maximum likelihood estimators of a log-concave density function. J. Stat. Comput. Simul., 77, 561–574.
Package logcondens.
# generate ordered random sample from GPD
set.seed(1977)
n <- 20
gam <- -0.75
x <- rgpd(n, gam)

# compute known endpoint
omega <- -1 / gam

# estimate log-concave density, i.e. generate dlc object
est <- logConDens(x, smoothed = FALSE, print = FALSE, gam = NULL, xs = NULL)

# plot distribution functions
s <- seq(0.01, max(x), by = 0.01)
plot(0, 0, type = 'n', ylim = c(0, 1), xlim = range(c(x, s))); rug(x)
lines(s, pgpd(s, gam), type = 'l', col = 2)
lines(x, 1:n / n, type = 's', col = 3)
lines(x, est$Fhat, type = 'l', col = 4)
legend(1, 0.4, c('true', 'empirical', 'estimated'), col = c(2 : 4), lty = 1)

# compute tail index estimators for all sensible indices k
falk.logcon <- falk(est)
falkMVUE.logcon <- falkMVUE(est, omega)
pick.logcon <- pickands(est)
genPick.logcon <- generalizedPick(est, c = 0.75, gam0 = -1/3)

# plot smoothed (solid) and unsmoothed (dashed) estimators versus number of order statistics
plot(0, 0, type = 'n', xlim = c(0,n), ylim = c(-1, 0.2))
lines(1:n, pick.logcon[, 2], col = 1); lines(1:n, pick.logcon[, 3], col = 1, lty = 2)
lines(1:n, falk.logcon[, 2], col = 2); lines(1:n, falk.logcon[, 3], col = 2, lty = 2)
lines(1:n, falkMVUE.logcon[,2], col = 3); lines(1:n, falkMVUE.logcon[,3], col = 3, lty = 2)
lines(1:n, genPick.logcon[, 2], col = 4); lines(1:n, genPick.logcon[, 3], col = 4, lty = 2)
abline(h = gam, lty = 3)
legend(11, 0.2, c("Pickands", "Falk", "Falk MVUE", "Generalized Pickands'"), lty = 1, col = 1:4)
Given an ordered sample of either exceedances or upper order statistics which is to be modeled using a GPD, this
function provides Falk's estimator of the shape parameter $\gamma$. The estimator is computed
for $H$ either the empirical distribution function or the distribution function based on the log–concave density estimator;
see Falk (1995) and Mueller and Rufibach (2009) for the defining formula.
Note that the estimate is not constrained to $[-1,0]$. If it falls outside this interval,
then it is likely that the log-concavity assumption is violated.
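For orientation, the estimator can be written out as follows. This is a reconstruction under the assumption that the package follows the "negative Hill" form of Falk (1995) with the sample maximum $X_{(n)}$ standing in for the unknown endpoint; the index range for $k$ is likewise an assumption and should be checked against Mueller and Rufibach (2009).

```latex
\hat\gamma_{\mathrm{Falk}}(k)
  = \frac{1}{k-1} \sum_{j=2}^{k}
    \log \frac{X_{(n)} - H^{-1}\bigl((n-j+1)/n\bigr)}
              {X_{(n)} - H^{-1}\bigl((n-k)/n\bigr)},
  \qquad k = 3, \ldots, n-1 .
```

When $H$ is the empirical distribution function, $H^{-1}(i/n)$ is the order statistic $X_{(i)}$, so each ratio lies in $(0, 1]$ and the resulting estimate is non-positive.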
falk(est, ks = NA)
est | Log-concave density estimate based on the sample, as output by logConDens. |
ks | Indices $k$ at which the estimate is computed. With the default NA, all sensible indices are used. |
n x 3 matrix with columns: indices $k$, Falk's estimator based on the log-concave density estimate, and
the ordinary Falk's estimator based on the order statistics.
Mueller, S. and Rufibach K. (2009). Smooth tail index estimation. J. Stat. Comput. Simul., 79, 1155–1167.
Falk, M. (1995). Some best parameter estimates for distributions with finite endpoint. Statistics, 27, 115–125.
Other approaches to estimate $\gamma$ based on the fact that the density is log–concave, thus
$\gamma \in [-1,0]$, are available as the functions pickands, falkMVUE, generalizedPick.
# generate ordered random sample from GPD
set.seed(1977)
n <- 20
gam <- -0.75
x <- rgpd(n, gam)

## generate dlc object
est <- logConDens(x, smoothed = FALSE, print = FALSE, gam = NULL, xs = NULL)

# compute tail index estimator
falk(est)
Given an ordered sample of either exceedances or upper order statistics which is to be modeled using a GPD with
distribution function $F$, this function provides Falk's estimator of the shape parameter $\gamma$
if the endpoint $\omega$ of $F$ is known. The estimator is computed
for $H$ either the empirical distribution function or the distribution function based on the log–concave density estimator;
see Falk (1994, 1995) and Mueller and Rufibach (2009) for the defining formula.
Note that the estimate is not constrained to $[-1,0]$. If it falls outside this interval,
then it is likely that the log-concavity assumption is violated.
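As a sketch of what the known-endpoint version looks like, the following formula assumes the same log-ratio form as Falk's estimator with the known endpoint $\omega$ replacing the sample maximum. It is a reconstruction, not a quotation of the package documentation, and the admissible range of $k$ is deliberately left unspecified.

```latex
\hat\gamma_{\mathrm{MVUE}}(k)
  = \frac{1}{k} \sum_{j=1}^{k}
    \log \frac{\omega - H^{-1}\bigl((n-j+1)/n\bigr)}
              {\omega - H^{-1}\bigl((n-k)/n\bigr)} .
```

Under this form the sum can start at $j = 1$ because $\omega$ lies strictly above the largest observation for a continuous distribution, so no logarithm of zero occurs.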
falkMVUE(est, omega, ks = NA)
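est | Log-concave density estimate based on the sample, as output by logConDens. |
omega | Known endpoint. Make sure that $\omega \ge X_{(n)}$, the largest observation. |
ks | Indices $k$ at which the estimate is computed. With the default NA, all sensible indices are used. |
n x 3 matrix with columns: indices $k$, Falk's MVUE estimator using the log-concave density estimate, and
the ordinary Falk MVUE estimator based on the order statistics.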
Mueller, S. and Rufibach K. (2009). Smooth tail index estimation. J. Stat. Comput. Simul., 79, 1155–1167.
Falk, M. (1994). Extreme quantile estimation in $\delta$-neighborhoods of generalized Pareto distributions. Statistics and Probability Letters, 20, 9–21.
Falk, M. (1995). Some best parameter estimates for distributions with finite endpoint. Statistics, 27, 115–125.
Other approaches to estimate $\gamma$ based on the fact that the density is log–concave, thus
$\gamma \in [-1,0]$, are available as the functions pickands, falk, generalizedPick.
# generate ordered random sample from GPD
set.seed(1977)
n <- 20
gam <- -0.75
x <- rgpd(n, gam)

## generate dlc object
est <- logConDens(x, smoothed = FALSE, print = FALSE, gam = NULL, xs = NULL)

# compute tail index estimators
omega <- -1 / gam
falkMVUE(est, omega)
Given an ordered sample of either exceedances or upper order statistics which is to be modeled using a GPD with
distribution function $F$, this function provides Segers' estimator of the shape parameter $\gamma$,
see Segers (2005). The estimator is computed
for $H$ either the empirical distribution function or the distribution function based on the log–concave density estimator,
using the mixing measure given in Segers (2005), Theorem 4.1, (i); see that reference for the defining formula.
Note that the estimate is not constrained to $[-1,0]$. If it falls outside this interval,
then it is likely that the log-concavity assumption is violated.
generalizedPick(est, c, gam0, ks = NA)
est | Log-concave density estimate based on the sample, as output by logConDens. |
c |
Number in |
gam0 |
Number in |
ks | Indices $k$ at which the estimate is computed. With the default NA, all sensible indices are used. |
n x 3 matrix with columns: indices $k$, Segers' estimator using the smoothing method, and
the ordinary Segers' estimator based on the order statistics.
Mueller, S. and Rufibach K. (2009). Smooth tail index estimation. J. Stat. Comput. Simul., 79, 1155–1167.
Segers, J. (2005). Generalized Pickands estimators for the extreme value index. J. Statist. Plann. Inference, 128, 381–396.
Other approaches to estimate $\gamma$ based on the fact that the density is log–concave, thus
$\gamma \in [-1,0]$, are available as the functions pickands, falk, falkMVUE.
# generate ordered random sample from GPD
set.seed(1977)
n <- 20
gam <- -0.75
x <- rgpd(n, gam)

## generate dlc object
est <- logConDens(x, smoothed = FALSE, print = FALSE, gam = NULL, xs = NULL)

# compute tail index estimators
generalizedPick(est, c = 0.75, gam0 = -1/3)
Density function, distribution function, quantile function and
random generation for the generalized Pareto distribution (GPD) with shape parameter $\gamma$ and
scale parameter $\sigma$.
dgpd(x, gam, sigma = 1)
pgpd(q, gam, sigma = 1)
qgpd(p, gam, sigma = 1)
rgpd(n, gam, sigma = 1)
x, q | Vector of quantiles. |
p | Vector of probabilities. |
n | Number of observations. |
gam | Shape parameter, real number. |
sigma | Scale parameter, positive real number. |
The generalized Pareto distribution function (Pickands, 1975) with
shape parameter $\gamma$ and scale parameter $\sigma$ is
$$W_{\gamma,\sigma}(x) = 1 - \Bigl(1 + \frac{\gamma x}{\sigma}\Bigr)_+^{-1/\gamma}.$$
If $\gamma = 0$, the distribution function is defined by continuity. The density is denoted by $w_{\gamma,\sigma}$.
dgpd gives the values of the density function, pgpd those of the distribution
function, and qgpd those of the quantile function of the GPD at $x$, $q$, and $p$, respectively.
rgpd generates random numbers, returned as an ordered vector.
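To make the relation between the distribution and quantile functions concrete, here is a minimal base-R sketch. It assumes the standard GPD parametrization with location 0 and a scalar shape parameter; the names pgpd_sketch and qgpd_sketch are hypothetical and are not the package's own implementations.

```r
# Base-R sketch of the GPD distribution and quantile functions (location 0).
# The *_sketch names are hypothetical and are not part of smoothtail.
pgpd_sketch <- function(q, gam, sigma = 1) {
  stopifnot(length(gam) == 1, sigma > 0)
  q <- pmax(q, 0)                          # no probability mass below the location 0
  if (gam == 0) {
    return(1 - exp(-q / sigma))            # limiting exponential case
  }
  inner <- pmax(1 + gam * q / sigma, 0)    # positive part (1 + gam * q / sigma)_+
  1 - inner^(-1 / gam)
}

qgpd_sketch <- function(p, gam, sigma = 1) {
  stopifnot(length(gam) == 1, sigma > 0, all(p >= 0 & p <= 1))
  if (gam == 0) {
    return(-sigma * log(1 - p))            # exponential quantile function
  }
  sigma * ((1 - p)^(-gam) - 1) / gam       # inverse of the distribution function
}

# Round trip: quantiles mapped back through the distribution function.
p <- c(0.1, 0.5, 0.9)
x <- qgpd_sketch(p, gam = -0.75)
pgpd_sketch(x, gam = -0.75)

# For negative shape the upper endpoint is -sigma / gam.
qgpd_sketch(1, gam = -0.75)
```

For a negative shape parameter the quantile at probability 1 equals $-\sigma/\gamma$, which is the endpoint used as omega in the examples of this manual.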
Kaspar Rufibach, [email protected],
http://www.kasparrufibach.ch
Samuel Mueller, [email protected],
www.maths.usyd.edu.au/ut/people?who=S_Mueller
Pickands, J. (1975). Statistical inference using extreme order statistics. Annals of Statistics, 3, 119-131.
Similar functions are provided in the R-packages evir and evd.
This function computes the mixing measure
given in Theorem 4.1 of Segers (2005) and is called by generalizedPick.
It is not intended to be called by the user.
Mueller, S. and Rufibach K. (2009). Smooth tail index estimation. J. Stat. Comput. Simul., 79, 1155–1167.
Segers, J. (2005). Generalized Pickands estimators for the extreme value index. J. Statist. Plann. Inference, 128, 381–396.
Called by generalizedPick.
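Given an ordered sample of either exceedances or upper order statistics which is to be modeled using a GPD, this
function provides Pickands' estimator of the shape parameter $\gamma$.
The estimator is computed from quantiles of $H$,
for $H$ either the empirical distribution function or the distribution function $\hat F_n$ based on the log–concave density
estimator; see Pickands (1975) and Mueller and Rufibach (2009) for the defining formula. The spacing between the
quantiles used is $\lfloor k/4 \rfloor$ if $H$ is the empirical distribution function and
$k/4$ if $H = \hat F_n$.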
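The unsmoothed version can be illustrated in a few lines of base R. The sketch below implements the classical Pickands (1975) estimator on the order statistics, assuming the spacing $r = \lfloor k/4 \rfloor$; pickands_sketch is a hypothetical helper written for this note and is not the package function pickands, which additionally returns the smoothed estimate.

```r
# Classical Pickands (1975) estimator on the order statistics, base R only.
# pickands_sketch is a hypothetical helper, not the smoothtail function.
pickands_sketch <- function(x, k) {
  x <- sort(x)
  n <- length(x)
  r <- floor(k / 4)                        # spacing for the empirical distribution function
  stopifnot(r >= 1, 4 * r <= n)            # all three order statistics must exist
  num <- x[n - r + 1] - x[n - 2 * r + 1]
  den <- x[n - 2 * r + 1] - x[n - 4 * r + 1]
  log(num / den) / log(2)
}

# Equally spaced points mimic a uniform sample, i.e. a GPD with shape -1:
# the spacings are 4 and 8, so the estimate is log(1/2) / log(2) = -1.
pickands_sketch(1:16, k = 16)
```

The estimator only uses three order statistics, which explains its high variance and motivates replacing them by quantiles of the smooth distribution function.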
pickands(est, ks = NA)
est | Log-concave density estimate based on the sample, as output by logConDens. |
ks | Indices $k$ at which the estimate is computed. With the default NA, all sensible indices are used. |
n x 3 matrix with columns: indices $k$, Pickands' estimator using the log-concave density estimate, and
the ordinary Pickands' estimator based on the order statistics.
Mueller, S. and Rufibach K. (2009). Smooth tail index estimation. J. Stat. Comput. Simul., 79, 1155–1167.
Pickands, J. (1975). Statistical inference using extreme order statistics. Annals of Statistics 3, 119–131.
Other approaches to estimate $\gamma$ based on the fact that the density is log–concave, thus
$\gamma \in [-1,0]$, are available as the functions falk, falkMVUE, generalizedPick.
# generate ordered random sample from GPD
set.seed(1977)
n <- 20
gam <- -0.75
x <- rgpd(n, gam)

## generate dlc object
est <- logConDens(x, smoothed = FALSE, print = FALSE, gam = NULL, xs = NULL)

# compute tail index estimators
pickands(est)