Title: | Estimate a Log-Concave Probability Mass Function from Discrete i.i.d. Observations |
---|---|
Description: | Given independent and identically distributed observations X(1), ..., X(n), allows to compute the maximum likelihood estimator (MLE) of probability mass function (pmf) under the assumption that it is log-concave, see Weyermann (2007) and Balabdaoui, Jankowski, Rufibach, and Pavlides (2012). The main functions of the package are 'logConDiscrMLE' that allows computation of the log-concave MLE, 'logConDiscrCI' that computes pointwise confidence bands for the MLE, and 'kInflatedLogConDiscr' that computes a mixture of a log-concave PMF and a point mass at k. |
Authors: | Kaspar Rufibach <[email protected]> and Fadoua Balabdaoui <[email protected]> and Hanna Jankowski <[email protected]> and Kathrin Weyermann <[email protected]> |
Maintainer: | Kaspar Rufibach <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.6 |
Built: | 2025-01-27 03:39:29 UTC |
Source: | https://github.com/cran/logcondiscr |
Implements the maximum likelihood estimator (MLE) for a probability mass function (PMF) under the assumption of log-concavity from i.i.d. data.
Package: | logcondiscr |
Type: | Package |
Version: | 1.0.6 |
Date: | 2015-07-03 |
License: | GPL (>=2) |
LazyLoad: | yes |
The main functions in the package are:
logConDiscrMLE
: Compute the maximum likelihood estimator (MLE) of a log-concave PMF from i.i.d. data. The constrained log-likelihood function is maximized using an active set algorithm as initially described in Weyermann (2007).
logConDiscrCI
: Compute the maximum likelihood estimator (MLE) of a log-concave PMF from i.i.d. data and corresponding, asymptotically valid, pointwise confidence bands as
developed in Balabdaoui et al (2012).
kInflatedLogConDiscr
: Compute an estimate of a mixture of a log-concave PMF that is inflated at , from i.i.d. data, using an EM algorithm.
Kaspar Rufibach (maintainer) [email protected]
http://www.kasparrufibach.ch
Fadoua Balabdaoui [email protected]
http://www.ceremade.dauphine.fr/~fadoua
Hanna Jankowski [email protected]
http://www.math.yorku.ca/~hkj
Kathrin Weyermann
Balabdaoui, F., Jankowski, H., Rufibach, K., and Pavlides, M. (2013). Maximum likelihood estimation and confidence bands for a discrete log-concave distribution. J. R. Stat. Soc. Ser. B Stat. Methodol., 75(4), 769–790.
Weyermann, K. (2007). An Active Set Algorithm for Log-Concave Discrete Distributions. MSc thesis, University of Bern (Supervisor: Lutz Duembgen).
Functions to estimate the log-concave MLE for a univariate continuous distribution are provided in the package logcondens and for observations in more than one dimension in LogConDEAD.
## see the help files for the abovementioned functions for examples
## see the help files for the abovementioned functions for examples
Internal functions for the estimation of a log-concave probability mass function. These functions are not intended to be called by the user directly.
Direction
Compute vector that points in direction of via Newton step.
dMLE
Compute the vector s.t. the log-likelihood function
, as implemented in
LikFunk
, is maximized
over all PMFs (under no additional restrictions, though).
GradientL
Gradient of LikFunk
.
HesseL
Hesse matrix of LikFunk
.
J00
Function introduced in Section 2.3 in Weyermann (2007), defined as
This function is used to compute the value of the log-likelihood in LikFunk
.
J10
Derivative of w.r.t to the first argument.
J11
Derivative of w.r.t to both arguments.
J20
Second derivative of w.r.t to the first argument.
LikFunk
The log-likelihood function for the discrete log-concave MLE.
LocalCoarsen
Auxiliary function.
LocalConcavity
Auxiliary function.
LocalExtend
Auxiliary function.
LocalMLE
Auxiliary function.
LocalNormalize
Auxiliary function.
StepSize
Auxiliary function.
Kaspar Rufibach (maintainer) [email protected]
http://www.kasparrufibach.ch
Fadoua Balabdaoui [email protected]
http://www.ceremade.dauphine.fr/~fadoua
Hanna Jankowski [email protected]
http://www.math.yorku.ca/~hkj
Kathrin Weyermann
Balabdaoui, F., Jankowski, H., Rufibach, K., and Pavlides, M. (2013). Maximum likelihood estimation and confidence bands for a discrete log-concave distribution. J. R. Stat. Soc. Ser. B Stat. Methodol., 75(4), 769–790.
Weyermann, K. (2007). An Active Set Algorithm for Log-Concave Discrete Distributions. MSc thesis, University of Bern (Supervisor: Lutz Duembgen).
All these functions are used by the function logConDiscrMLE
.
Using an EM algorithm, compute an estimate of a mixture of a point mass at and a
log-concave probability mass function from discrete i.i.d. data.
kInflatedLogConDiscr(x, k = 0, prec1 = 1e-10, prec2 = 1e-15, itermax = 200, output = TRUE, theta0 = 0.5, p0 = NA)
kInflatedLogConDiscr(x, k = 0, prec1 = 1e-10, prec2 = 1e-15, itermax = 200, output = TRUE, theta0 = 0.5, p0 = NA)
x |
Vector of observations. |
k |
Point at which inflation should be assumed. Must be in |
prec1 |
Precision for stopping criterion. |
prec2 |
Precision to remove ends of support in case weights |
itermax |
Maximal number of iterations of EM algorithm. |
output |
Logical, if |
theta0 |
Optional initialization for |
p0 |
Optional initialization for the PMF. |
Given a vector of observations from a discrete PMF with a (potential)
point mass at
(typically
),
kInflatedLogConDiscr
computes a pmf that is a mixture between a point mass at and a log-concave PMF on
.
To accomplish this, an EM algorithm is used.
A list containing the following elementes:
z |
The support. |
f |
The estimated |
E(L) |
The value of the expected composite log-likelihood at the maximum. |
loglik |
The value of the composite log-likelihood at the maximum. |
theta |
The estimated weight at |
logconc.pmf |
The log-concave part of the mixture. |
logconc.z |
The support of |
Kaspar Rufibach (maintainer) [email protected]
http://www.kasparrufibach.ch
Fadoua Balabdaoui [email protected]
http://www.ceremade.dauphine.fr/~fadoua
Hanna Jankowski [email protected]
http://www.math.yorku.ca/~hkj
Kathrin Weyermann
Balabdaoui, F., Jankowski, H., Rufibach, K., and Pavlides, M. (2013). Maximum likelihood estimation and confidence bands for a discrete log-concave distribution. J. R. Stat. Soc. Ser. B Stat. Methodol., 75(4), 769–790.
Weyermann, K. (2007). An Active Set Algorithm for Log-Concave Discrete Distributions. MSc thesis, University of Bern (Supervisor: Lutz Duembgen).
## ----------------------------------------------- ## generate zero-inflated negative binomial sample ## ----------------------------------------------- set.seed(2011) n <- 100 theta <- 0.05 r <- 6 p <- 0.3 x <- rnbinom(n, r, p) ## inflate at 0 x <- ifelse(runif(n) <= theta, 0, x) ## estimate log-concave MLE fit1 <- logConDiscrMLE(x, w = NA, psi_o = NA, prec = 1e-05, output = TRUE) ## estimate zero-inflated log-concave MLE fit2 <- kInflatedLogConDiscr(x, k = 0) ## plot the results par(mfrow = c(1, 1), las = 1) plot(fit1$x, exp(fit1$psi), type = "b", col = 2, xlim = range(x), xlab = "x", ylim = c(0, max(exp(fit1$psi), fit2$f)), ylab = "PMF", main = "Estimate MLE from a zero-inflated negative-binomial", pch = 19) lines(fit2$z, fit2$f, type = "b", col = 4, pch = 15) ## add the true PMF we sampled from z <- fit2$z f.true <- theta * c(1, rep(0, length(z) - 1)) + (1 - theta) * dnbinom(z, r, p) lines(z, f.true, col = 6, type = "b", pch = 17) legend("topright", c("log-concave MLE", "zero-inflated log-concave MLE", "true PMF"), col = c(2, 4, 6), lty = c(1, 1, 1), pch = c(19, 15, 17), bty = "n") ## Not run: ## ----------------------------------------------- ## generate seven-inflated negative binomial sample ## ----------------------------------------------- theta <- 0.05 r <- 4 p <- 0.3 n <- 10000 x <- rnbinom(n, r, p) x <- ifelse(runif(n) <= theta, 7, x) x <- c(x, rep(7, 10)) ## compute different estimates zero.mle <- kInflatedLogConDiscr(x, k = 7) mle <- logConDiscrMLE(x, output = FALSE) f.mle <- exp(mle$psiSupp) z<- zero.mle$z f1 <- theta * c(rep(0, 7 - min(x)), 1, rep(0, max(x) - 7)) f2 <- (1 - theta) * dnbinom(z, r, p) f.true <- f1 + f2 true <- dnbinom(z, r, p) f.fit <- zero.mle$f xx <- sort(unique(x)) emp <- rep(0, length(z)) emp[xx - min(x) + 1] <- as.vector(table(x) / n) ## plot results plot(z, f.true, type = "l", ylim = c(0, max(emp)), xlab = "x", ylab = "PMF", main = "Illustration k-inflated estimator") points(z, true, type = "l", lty = 2) points(z, f.fit, type = "l", col = "red") points(z, zero.mle$logconc.pmf, type = "l", col = "red", lty = 2) points(min(x):max(x), f.mle, type = "l", col = "green") points(z, emp, type = "l", col = "purple") points(z, emp, col = "purple") legend("topright", inset = 0.05, c("true", "true less seven", "seven-inflated", "recovered", "logconc", "empirical"), lty=c(1, 2, 1, 2, 1, 1), col = c("black", "black", "red", "red", "green", "purple")) ## zoom in near mode subs <- 4:12 plot(z[subs], f.true[subs], type = "l", ylim = c(0, max(emp)), xlab = "x", ylab = "PMF", main = "Illustration k-inflated estimator") points(z[subs], true[subs], type = "l", lty = 2) points(z[subs], f.fit[subs], type = "l", col = "red") points(z[subs], zero.mle$logconc.pmf[subs], type = "l", col = "red", lty = 2) points((min(x):max(x))[subs], f.mle[subs], type = "l", col = "green") points(z[subs], emp[subs], type = "l", col = "purple") points(z[subs], emp[subs], col = "purple") ## End(Not run)
## ----------------------------------------------- ## generate zero-inflated negative binomial sample ## ----------------------------------------------- set.seed(2011) n <- 100 theta <- 0.05 r <- 6 p <- 0.3 x <- rnbinom(n, r, p) ## inflate at 0 x <- ifelse(runif(n) <= theta, 0, x) ## estimate log-concave MLE fit1 <- logConDiscrMLE(x, w = NA, psi_o = NA, prec = 1e-05, output = TRUE) ## estimate zero-inflated log-concave MLE fit2 <- kInflatedLogConDiscr(x, k = 0) ## plot the results par(mfrow = c(1, 1), las = 1) plot(fit1$x, exp(fit1$psi), type = "b", col = 2, xlim = range(x), xlab = "x", ylim = c(0, max(exp(fit1$psi), fit2$f)), ylab = "PMF", main = "Estimate MLE from a zero-inflated negative-binomial", pch = 19) lines(fit2$z, fit2$f, type = "b", col = 4, pch = 15) ## add the true PMF we sampled from z <- fit2$z f.true <- theta * c(1, rep(0, length(z) - 1)) + (1 - theta) * dnbinom(z, r, p) lines(z, f.true, col = 6, type = "b", pch = 17) legend("topright", c("log-concave MLE", "zero-inflated log-concave MLE", "true PMF"), col = c(2, 4, 6), lty = c(1, 1, 1), pch = c(19, 15, 17), bty = "n") ## Not run: ## ----------------------------------------------- ## generate seven-inflated negative binomial sample ## ----------------------------------------------- theta <- 0.05 r <- 4 p <- 0.3 n <- 10000 x <- rnbinom(n, r, p) x <- ifelse(runif(n) <= theta, 7, x) x <- c(x, rep(7, 10)) ## compute different estimates zero.mle <- kInflatedLogConDiscr(x, k = 7) mle <- logConDiscrMLE(x, output = FALSE) f.mle <- exp(mle$psiSupp) z<- zero.mle$z f1 <- theta * c(rep(0, 7 - min(x)), 1, rep(0, max(x) - 7)) f2 <- (1 - theta) * dnbinom(z, r, p) f.true <- f1 + f2 true <- dnbinom(z, r, p) f.fit <- zero.mle$f xx <- sort(unique(x)) emp <- rep(0, length(z)) emp[xx - min(x) + 1] <- as.vector(table(x) / n) ## plot results plot(z, f.true, type = "l", ylim = c(0, max(emp)), xlab = "x", ylab = "PMF", main = "Illustration k-inflated estimator") points(z, true, type = "l", lty = 2) points(z, f.fit, type = "l", col = "red") points(z, zero.mle$logconc.pmf, type = "l", col = "red", lty = 2) points(min(x):max(x), f.mle, type = "l", col = "green") points(z, emp, type = "l", col = "purple") points(z, emp, col = "purple") legend("topright", inset = 0.05, c("true", "true less seven", "seven-inflated", "recovered", "logconc", "empirical"), lty=c(1, 2, 1, 2, 1, 1), col = c("black", "black", "red", "red", "green", "purple")) ## zoom in near mode subs <- 4:12 plot(z[subs], f.true[subs], type = "l", ylim = c(0, max(emp)), xlab = "x", ylab = "PMF", main = "Illustration k-inflated estimator") points(z[subs], true[subs], type = "l", lty = 2) points(z[subs], f.fit[subs], type = "l", col = "red") points(z[subs], zero.mle$logconc.pmf[subs], type = "l", col = "red", lty = 2) points((min(x):max(x))[subs], f.mle[subs], type = "l", col = "green") points(z[subs], emp[subs], type = "l", col = "purple") points(z[subs], emp[subs], col = "purple") ## End(Not run)
Compute pointwise confidence bands for the log-concave maximum likelihood estimate of a log-concave probability mass function based on the limiting theory developed in Balabdaoui et al (2012).
logConDiscrCI(dat, conf.level = 0.95, type = c("MLE", "all")[1], B = 1000, output = TRUE, seed = 2011)
logConDiscrCI(dat, conf.level = 0.95, type = c("MLE", "all")[1], B = 1000, output = TRUE, seed = 2011)
dat |
Data to compute MLE and confidence band for. |
conf.level |
The confidence level to be used. |
type |
To compute confidence bands one theoretically needs to know the knots of the true PMF. For type |
B |
Number of samples to be drawn to compute resampling quantiles. |
output |
If |
seed |
Optional seed. |
The pointwise confidence bands are based on the limiting theory in Balabdaoui et al (2011).
A list with the following components:
MLE |
The estimated MLE (simply the output list of the function |
emp |
A dataframe containing two columns: the unique sorted observations and the empirical PMF. |
CIs |
The computed confidence intervals for each |
Values outside will be clipped. As a consequence, coverage may be higher than
.
Kaspar Rufibach (maintainer) [email protected]
http://www.kasparrufibach.ch
Fadoua Balabdaoui [email protected]
http://www.ceremade.dauphine.fr/~fadoua
Hanna Jankowski [email protected]
http://www.math.yorku.ca/~hkj
Kathrin Weyermann
Balabdaoui, F., Jankowski, H., Rufibach, K., and Pavlides, M. (2013). Maximum likelihood estimation and confidence bands for a discrete log-concave distribution. J. R. Stat. Soc. Ser. B Stat. Methodol., 75(4), 769–790.
Weyermann, K. (2007). An Active Set Algorithm for Log-Concave Discrete Distributions. MSc thesis, University of Bern (Supervisor: Lutz Duembgen).
# ------------------------------------------------------------- # compute MLE and confidence bands for a random sample from a # Poisson distribution # ------------------------------------------------------------- set.seed(2011) x <- sort(rpois(n = 100, lambda = 2)) mle <- logConDiscrMLE(x) psi <- mle$psi # compute confidence bands CIs <- logConDiscrCI(x, type = "MLE", output = TRUE, seed = 20062011)$CIs # plot estimated PMF and log of estimate par(mfrow = c(1, 2), las = 1) true <- dpois(0:20, lambda = 2) plot(mle$x, exp(psi), type = "b", col = 2, xlim = c(min(x), max(x) + 1), xlab = "x", ylim = c(0, max(exp(psi), true, CIs[, 3])), ylab = "PMF", main = "Estimate MLE from a Poisson", pch = 19) legend("topright", c("truth", "MLE", "confidence bands"), col = c(4, 2, 2), lty = c(1, 1, 2), pch = c(0, 19, NA), bty = "n") # add true PMF lines(0:20, true, type = "l", col = 4) # add confidence bands matlines(CIs[, 1], CIs[, 2:3], type = "l", lty = 2, col = 2) # log-density plot(mle$x, psi, type = "p", col = 2, xlim = c(min(x), max(x) + 1), xlab = "x", ylab = "PMF", main = "Estimate MLE from a Poisson", pch = 19, ylim = c(-6, log(max(exp(psi), true, CIs[, 3])))) lines(0:20, log(true), type = "l", col = 4) # add confidence bands matlines(CIs[, 1], log(CIs[, 2:3]), type = "l", lty = 2, col = 2) # ------------------------------------------------------------- # compute confidence bands when only estimate (not original data) # are available (as a an example we simply use the estimator from # above) # ------------------------------------------------------------- x.est <- 0:6 est <- c(0.09, 0.30, 0.30, 0.19, 0.09, 0.02, 0.01) # generate original data (up to given precision) x <- rep(0:6, times = 100 * est)
# ------------------------------------------------------------- # compute MLE and confidence bands for a random sample from a # Poisson distribution # ------------------------------------------------------------- set.seed(2011) x <- sort(rpois(n = 100, lambda = 2)) mle <- logConDiscrMLE(x) psi <- mle$psi # compute confidence bands CIs <- logConDiscrCI(x, type = "MLE", output = TRUE, seed = 20062011)$CIs # plot estimated PMF and log of estimate par(mfrow = c(1, 2), las = 1) true <- dpois(0:20, lambda = 2) plot(mle$x, exp(psi), type = "b", col = 2, xlim = c(min(x), max(x) + 1), xlab = "x", ylim = c(0, max(exp(psi), true, CIs[, 3])), ylab = "PMF", main = "Estimate MLE from a Poisson", pch = 19) legend("topright", c("truth", "MLE", "confidence bands"), col = c(4, 2, 2), lty = c(1, 1, 2), pch = c(0, 19, NA), bty = "n") # add true PMF lines(0:20, true, type = "l", col = 4) # add confidence bands matlines(CIs[, 1], CIs[, 2:3], type = "l", lty = 2, col = 2) # log-density plot(mle$x, psi, type = "p", col = 2, xlim = c(min(x), max(x) + 1), xlab = "x", ylab = "PMF", main = "Estimate MLE from a Poisson", pch = 19, ylim = c(-6, log(max(exp(psi), true, CIs[, 3])))) lines(0:20, log(true), type = "l", col = 4) # add confidence bands matlines(CIs[, 1], log(CIs[, 2:3]), type = "l", lty = 2, col = 2) # ------------------------------------------------------------- # compute confidence bands when only estimate (not original data) # are available (as a an example we simply use the estimator from # above) # ------------------------------------------------------------- x.est <- 0:6 est <- c(0.09, 0.30, 0.30, 0.19, 0.09, 0.02, 0.01) # generate original data (up to given precision) x <- rep(0:6, times = 100 * est)
Compute the maximum likelihood estimate of a log-concave probability mass function from discrete i.i.d. data.
logConDiscrMLE(x, w = NA, psi_o = NA, prec = 1e-05, output = TRUE)
logConDiscrMLE(x, w = NA, psi_o = NA, prec = 1e-05, output = TRUE)
x |
Vector of observations. If |
w |
If |
psi_o |
Optional start vector. |
prec |
Precision for stopping criterion. |
output |
Logical, if |
Given a vector of observations from a discrete PMF,
logConDiscrMLE
computes a function on
with knots only
in
such that
is maximal over all log-concave PMFs , where
is the frequency of the observation
.
To accomplish this, an active set algorithm is used.
A list containing the following elementes:
x |
Vector of unique observations, sorted. |
w |
Generated weights. |
psi |
The estimated log-density on the grid of unique, sorted observations. |
L |
The value of the log-likelihood at the maximum. |
IsKnot |
Binary vector where |
xSupp |
The full support |
psiSupp |
|
Kaspar Rufibach (maintainer) [email protected]
http://www.kasparrufibach.ch
Fadoua Balabdaoui [email protected]
http://www.ceremade.dauphine.fr/~fadoua
Hanna Jankowski [email protected]
http://www.math.yorku.ca/~hkj
Kathrin Weyermann
Balabdaoui, F., Jankowski, H., Rufibach, K., and Pavlides, M. (2013). Maximum likelihood estimation and confidence bands for a discrete log-concave distribution. J. R. Stat. Soc. Ser. B Stat. Methodol., 75(4), 769–790.
Weyermann, K. (2007). An Active Set Algorithm for Log-Concave Discrete Distributions. MSc thesis, University of Bern (Supervisor: Lutz Duembgen).
# ------------------------------------------------------------- # compute MLE for a random sample from a Poisson distribution # ------------------------------------------------------------- x <- sort(rpois(n = 100, lambda = 2)) mle <- logConDiscrMLE(x) psi <- mle$psi # plot estimated PMF and log of estimate par(mfrow = c(1, 2), las = 1) true <- dpois(0:20, lambda = 2) plot(mle$x, exp(psi), type = "p", col = 2, xlim = range(x), xlab = "x", ylim = c(0, max(exp(psi), true)), ylab = "PMF", main = "Estimate MLE from a Poisson", pch = 19) legend("topright", c("truth", "MLE"), col = c(4, 2), lty = c(1, 0), pch = c(0, 19), bty = "n") # add true PMF lines(0:20, true, type = "l", col = 4) # log-density plot(mle$x, psi, type = "p", col = 2, xlim = range(x), xlab = "x", ylab = "PMF", main = "Estimate MLE from a Poisson", pch = 19) lines(0:20, log(true), type = "l", col = 4) # use a priori specified weights: mle = mle2 mle2 <- logConDiscrMLE(x = unique(x), w = table(x)) ## ------------------------------------------------------------- ## Illustrate the limit process: the code below can be used to ## to reproduce the limit process figure in Balabdaoui et al (2011) ## ------------------------------------------------------------- a <- 1 b <- 7 c <- 8 d <- 11 e <- 2 n <- 10 ^ 2 ## support x <- seq(a, d, by = 1) ## true density dens <- dTriangular(a, b, c, d, e) logdens <- log(dens) rand <- rTriangular(n, a, b, c, d, e)$rand ## empirical emp <- table(rand) / n x.emp <- names(table(rand)) ## log-concave MLE mle <- logConDiscrMLE(rand, output = FALSE) ## plot log PMF and PMF par(mfrow = c(1, 2)) plot(x, logdens, type = "l", col = 1, pch = 19, main = "log-density", xlim = c(a, d), ylim = range(range(log(emp), logdens))) lines(x, logdens, type = "l", col = 1, lwd = 0.1) points(x.emp, log(emp), col = 4, pch = 19) points(mle$x, mle$psi, col = 6, pch = 19) abline(v = mle$x[mle$isKnot == 1], lty = 3, col = 3) plot(x, dens, type = "l", col = 1, pch = 19, main = "density", xlim = c(a, d), ylim = c(0, max(dens, emp))) lines(x, dens, type = "l", col = 1, lwd = 0.1) points(x.emp, emp, col = 4, pch = 19) points(mle$x, exp(mle$psi), col = 6, pch = 19) legend("topleft", c("truth", "MLE", "knots of the MLE", "empirical"), col = c(1, 6, 3, 4), pch = c(NA, 19, NA, 19), lty = c(1, NA, 3, NA), bg = "white", bty = "n") abline(v = mle$x[mle$isKnot == 1], lty = 3, col = 3) ## ------------------------------------------------------------- ## Now compute and plot Y(x) and H(x) ## ------------------------------------------------------------- xla <- paste("x = {r = ", a, ", ..., s - 1 = ", b - 1, "}", sep = "") par(mfcol = c(2, 2), oma = rep(0, 4), mar = c(4.5, 4.5, 2, 1), las = 1) plot(x, logdens, type = "b", col = 2, pch = 19, main = "log of normalized triangular pmf", xlim = c(a, d), xaxt = "n", xlab = "x", ylab = "log of normalized pmf") axis(1, at = c(a, b, d), labels = paste(c("a = ", "b = ", "d = "), c(a, b, d), sep = "")) ## compute H(x) r <- a s <- b ind <- r:(s - 1) px <- dens p_rs <- px[ind] m <- s - r ## ------------------------------------------------------------- ## generate one observation from the distribution of U(F(x)) - U(F(x - 1)) ## ------------------------------------------------------------- sigma <- diag(m) * 0 for (i in 1:m){ for (j in 1:m){ sigma[i, j] <- p_rs[i] * (i == j) - p_rs[i] * p_rs[j] } } set.seed(23041977) cx <- rep(NA, d - a + 1) cx[ind] <- rmvnorm(1, mean = rep(0, m), sigma = sigma, method = c("eigen", "svd", "chol")[3]) Ux <- rep(NA, length(x)) Ux[ind] <- cx[ind] X <- x[ind] Y <- Ux[ind] / p_rs W <- p_rs ## concave regression using 'cobs' Res <- conreg(x = X, y = Y, w = W, verbose = TRUE) g <- Res$yf gKnots <- Res$iKnots plot(X, Y, main = expression("The concave function g* that minimizes "*Phi*"(g)"), xaxt = "n", ylab = "g*", ylim = range(c(Y, g)), xlab = xla, type = "n") axis(1, at = 0:100, labels = 0:100); abline(v = x[gKnots], lty = 2, col = grey(0.75)) lines(X, g, lwd = 2, col = 3, type = "b", pch = 1) lines(X, Y, lwd = 1, col = 2, type = "p", pch = 19) legend("bottomright", c("values of cx / px", "minimizer g*"), lty = c(NA, 1), pch = c(19, 1), col = 2:3, bty = "n", lwd = c(NA, 2)) ## compute H(x) for x = r, ..., s - 1 and plot it gstar <- rep(NA, length(x)) gstar[ind] <- g xs <- r:(s - 1) Hs <- rep(0, length(xs)) for (i in 2:length(xs)){ for (ks in r:(xs[i] - 1)){ js <- r:ks Hs[i] <- Hs[i] + sum(gstar[js] * px[js]) } } ## check (Hs[3:length(Hs)] - 2 * Hs[2:(length(Hs) - 1)] + Hs[1:(length(Hs) - 2)]) / p_rs[2:(length(Hs) - 1)] gstar ## ------------------------------------------------------------- ## plot the product of g* and px (the limit of the PMF) ## ------------------------------------------------------------- plot(x[ind], gstar[ind] * p_rs, main = expression("g"^"*"* " * p"), xaxt = "n", pch = 19, col = 2, ylab = "g*", type = "b", xlab = xla) axis(1, at = 0:100, labels = 0:100); abline(v = x[gKnots], lty = 2, col = grey(0.75)) ## compute Y(x) for x = r, ..., s - 1 and plot it Ys <- rep(0, length(xs)) for (i in 2:length(xs)){ for (ks in r:(xs[i] - 1)){ js <- r:ks Ys[i] <- Ys[i] + sum(cx[js]) } } ## plot the two processes plot(xs, Ys, type = "n", col = 2, xaxt = "n", lwd = 2, main = "The processes H(x) and Y(x)", ylab = "H and Y", xlab = xla) axis(1, at = 0:100, labels = 0:100); abline(v = x[gKnots], lty = 2, col = grey(0.75)) lines(xs, Hs, col = 2, lwd = 1, type = "b") lines(xs, Ys, col = 3, lwd = 2, type = "l", lty = 2) legend("topleft", c("H(x)", "Y(x)"), col = 2:3, lty = c(1, 2), pch = 1, bty = "n", lwd = c(1, 2))
# ------------------------------------------------------------- # compute MLE for a random sample from a Poisson distribution # ------------------------------------------------------------- x <- sort(rpois(n = 100, lambda = 2)) mle <- logConDiscrMLE(x) psi <- mle$psi # plot estimated PMF and log of estimate par(mfrow = c(1, 2), las = 1) true <- dpois(0:20, lambda = 2) plot(mle$x, exp(psi), type = "p", col = 2, xlim = range(x), xlab = "x", ylim = c(0, max(exp(psi), true)), ylab = "PMF", main = "Estimate MLE from a Poisson", pch = 19) legend("topright", c("truth", "MLE"), col = c(4, 2), lty = c(1, 0), pch = c(0, 19), bty = "n") # add true PMF lines(0:20, true, type = "l", col = 4) # log-density plot(mle$x, psi, type = "p", col = 2, xlim = range(x), xlab = "x", ylab = "PMF", main = "Estimate MLE from a Poisson", pch = 19) lines(0:20, log(true), type = "l", col = 4) # use a priori specified weights: mle = mle2 mle2 <- logConDiscrMLE(x = unique(x), w = table(x)) ## ------------------------------------------------------------- ## Illustrate the limit process: the code below can be used to ## to reproduce the limit process figure in Balabdaoui et al (2011) ## ------------------------------------------------------------- a <- 1 b <- 7 c <- 8 d <- 11 e <- 2 n <- 10 ^ 2 ## support x <- seq(a, d, by = 1) ## true density dens <- dTriangular(a, b, c, d, e) logdens <- log(dens) rand <- rTriangular(n, a, b, c, d, e)$rand ## empirical emp <- table(rand) / n x.emp <- names(table(rand)) ## log-concave MLE mle <- logConDiscrMLE(rand, output = FALSE) ## plot log PMF and PMF par(mfrow = c(1, 2)) plot(x, logdens, type = "l", col = 1, pch = 19, main = "log-density", xlim = c(a, d), ylim = range(range(log(emp), logdens))) lines(x, logdens, type = "l", col = 1, lwd = 0.1) points(x.emp, log(emp), col = 4, pch = 19) points(mle$x, mle$psi, col = 6, pch = 19) abline(v = mle$x[mle$isKnot == 1], lty = 3, col = 3) plot(x, dens, type = "l", col = 1, pch = 19, main = "density", xlim = c(a, d), ylim = c(0, max(dens, emp))) lines(x, dens, type = "l", col = 1, lwd = 0.1) points(x.emp, emp, col = 4, pch = 19) points(mle$x, exp(mle$psi), col = 6, pch = 19) legend("topleft", c("truth", "MLE", "knots of the MLE", "empirical"), col = c(1, 6, 3, 4), pch = c(NA, 19, NA, 19), lty = c(1, NA, 3, NA), bg = "white", bty = "n") abline(v = mle$x[mle$isKnot == 1], lty = 3, col = 3) ## ------------------------------------------------------------- ## Now compute and plot Y(x) and H(x) ## ------------------------------------------------------------- xla <- paste("x = {r = ", a, ", ..., s - 1 = ", b - 1, "}", sep = "") par(mfcol = c(2, 2), oma = rep(0, 4), mar = c(4.5, 4.5, 2, 1), las = 1) plot(x, logdens, type = "b", col = 2, pch = 19, main = "log of normalized triangular pmf", xlim = c(a, d), xaxt = "n", xlab = "x", ylab = "log of normalized pmf") axis(1, at = c(a, b, d), labels = paste(c("a = ", "b = ", "d = "), c(a, b, d), sep = "")) ## compute H(x) r <- a s <- b ind <- r:(s - 1) px <- dens p_rs <- px[ind] m <- s - r ## ------------------------------------------------------------- ## generate one observation from the distribution of U(F(x)) - U(F(x - 1)) ## ------------------------------------------------------------- sigma <- diag(m) * 0 for (i in 1:m){ for (j in 1:m){ sigma[i, j] <- p_rs[i] * (i == j) - p_rs[i] * p_rs[j] } } set.seed(23041977) cx <- rep(NA, d - a + 1) cx[ind] <- rmvnorm(1, mean = rep(0, m), sigma = sigma, method = c("eigen", "svd", "chol")[3]) Ux <- rep(NA, length(x)) Ux[ind] <- cx[ind] X <- x[ind] Y <- Ux[ind] / p_rs W <- p_rs ## concave regression using 'cobs' Res <- conreg(x = X, y = Y, w = W, verbose = TRUE) g <- Res$yf gKnots <- Res$iKnots plot(X, Y, main = expression("The concave function g* that minimizes "*Phi*"(g)"), xaxt = "n", ylab = "g*", ylim = range(c(Y, g)), xlab = xla, type = "n") axis(1, at = 0:100, labels = 0:100); abline(v = x[gKnots], lty = 2, col = grey(0.75)) lines(X, g, lwd = 2, col = 3, type = "b", pch = 1) lines(X, Y, lwd = 1, col = 2, type = "p", pch = 19) legend("bottomright", c("values of cx / px", "minimizer g*"), lty = c(NA, 1), pch = c(19, 1), col = 2:3, bty = "n", lwd = c(NA, 2)) ## compute H(x) for x = r, ..., s - 1 and plot it gstar <- rep(NA, length(x)) gstar[ind] <- g xs <- r:(s - 1) Hs <- rep(0, length(xs)) for (i in 2:length(xs)){ for (ks in r:(xs[i] - 1)){ js <- r:ks Hs[i] <- Hs[i] + sum(gstar[js] * px[js]) } } ## check (Hs[3:length(Hs)] - 2 * Hs[2:(length(Hs) - 1)] + Hs[1:(length(Hs) - 2)]) / p_rs[2:(length(Hs) - 1)] gstar ## ------------------------------------------------------------- ## plot the product of g* and px (the limit of the PMF) ## ------------------------------------------------------------- plot(x[ind], gstar[ind] * p_rs, main = expression("g"^"*"* " * p"), xaxt = "n", pch = 19, col = 2, ylab = "g*", type = "b", xlab = xla) axis(1, at = 0:100, labels = 0:100); abline(v = x[gKnots], lty = 2, col = grey(0.75)) ## compute Y(x) for x = r, ..., s - 1 and plot it Ys <- rep(0, length(xs)) for (i in 2:length(xs)){ for (ks in r:(xs[i] - 1)){ js <- r:ks Ys[i] <- Ys[i] + sum(cx[js]) } } ## plot the two processes plot(xs, Ys, type = "n", col = 2, xaxt = "n", lwd = 2, main = "The processes H(x) and Y(x)", ylab = "H and Y", xlab = xla) axis(1, at = 0:100, labels = 0:100); abline(v = x[gKnots], lty = 2, col = grey(0.75)) lines(xs, Hs, col = 2, lwd = 1, type = "b") lines(xs, Ys, col = 3, lwd = 2, type = "l", lty = 2) legend("topleft", c("H(x)", "Y(x)"), col = 2:3, lty = c(1, 2), pch = 1, bty = "n", lwd = c(1, 2))
In Balabdaoui et al (2012) the triangular density, defined as
for and
for , was used to illustrate the limit process of the log-concave MLE. In order to provide the code to generate the limit process figure in Balabdaoui et al (2012, see the example in
logConDiscrMLE
for the code to generate that figure) the functions dTriangular
and rTriangular
are included in this package. Note that rTriangular
uses the rejection sampling algorithm in Devroye (1987) which was specifically developed to generate random numbers from a log-concave PMF.
dTriangular(a, b, c, d, e) rTriangular(n, a, b, c, d, e)
dTriangular(a, b, c, d, e) rTriangular(n, a, b, c, d, e)
a |
Left endpoint of triangular pmf. |
b |
Mode of triangular pmf. |
c |
Height at mode. |
d |
Left endpoint. |
e |
Height at left endpoint. |
n |
Number of random numbers to generate. |
dTriangular
returns a vector containing the value of the PMF at all values in .
rTriangular
returns a list containing the elements:
rand |
Vector with generated random numbers of length |
x |
Vector |
dens |
Value of the pmf at |
This function is used to generate the plot of the limit process in the help file for the function logConDiscrMLE
.
Kaspar Rufibach (maintainer) [email protected]
http://www.kasparrufibach.ch
Fadoua Balabdaoui [email protected]
http://www.ceremade.dauphine.fr/~fadoua
Hanna Jankowski [email protected]
http://www.math.yorku.ca/~hkj
Kathrin Weyermann
Balabdaoui, F., Jankowski, H., Rufibach, K., and Pavlides, M. (2013). Maximum likelihood estimation and confidence bands for a discrete log-concave distribution. J. R. Stat. Soc. Ser. B Stat. Methodol., 75(4), 769–790.
Devroye, L. (1987). A simple generator for discrete log-concave distributions. Computing, 39, 87-91.
## ------------------------------------------------------------- ## compute values of triangular density and simulate from it ## ------------------------------------------------------------- a <- 1 b <- 7 c <- 8 d <- 11 e <- 2 n <- 10 ^ 2 ## support x <- seq(a, d, by = 1) ## true density dens <- dTriangular(a, b, c, d, e) logdens <- log(dens) rand <- rTriangular(n, a, b, c, d, e)$rand ## does the same as rTriangular() rand2 <- sample(x = a:d, size = n, prob = dens, replace = TRUE)
## ------------------------------------------------------------- ## compute values of triangular density and simulate from it ## ------------------------------------------------------------- a <- 1 b <- 7 c <- 8 d <- 11 e <- 2 n <- 10 ^ 2 ## support x <- seq(a, d, by = 1) ## true density dens <- dTriangular(a, b, c, d, e) logdens <- log(dens) rand <- rTriangular(n, a, b, c, d, e)$rand ## does the same as rTriangular() rand2 <- sample(x = a:d, size = n, prob = dens, replace = TRUE)