Title: | Generate "LaTeX" Tables of Descriptive Statistics |
---|---|
Description: | These functions are especially helpful when writing reports of data analysis using "Sweave". |
Authors: | Kaspar Rufibach |
Maintainer: | Kaspar Rufibach <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1.3 |
Built: | 2024-11-10 05:14:41 UTC |
Source: | https://github.com/cran/reporttools |
Provides functions to generate tables of descriptive statistics for continuous and nominal variables, as well as some further data manipulation functions. These functions are especially helpful when writing reports of data analysis using Sweave.
Package: | reporttools |
Type: | Package |
Version: | 1.1.3 |
Date: | 2021-10-10 |
Depends: | xtable, survival |
License: | GPL (>=2) |
At the beginning of data analysis, it is often useful to have tables of descriptive values for continuous and nominal variables available. This package provides such functions, where the output is a LaTeX table. The functions are most efficiently used when generating reports combining LaTeX with R via Sweave.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
I thank Daniel Sabanes-Bove, Sarah Haile, Philipp Muri, Patrich McCormick, and Sina Rueeger for helpful discussions and remarks.
Rufibach, K. (2009)
reporttools: R-Functions to Generate LaTeX Tables of Descriptive Statistics.
Journal of Statistical Software, Code Snippets, 31(1).
doi:10.18637/jss.v031.c01.
Given a dataframe with a column containing character strings, generate a new dataframe where these strings have a maximal length. Useful when embedding dataframes in a Sweave document, so that they do not exceed the page width.
addLineBreak(tab, length, col)
tab |
Dataframe containing the data. |
length |
Maximal length to which strings should be broken. |
col |
Column of |
List with two elements: the resulting dataframe with lines broken, and a vector that gives the row where each entry in the new dataframe ends. The latter is useful when horizontal lines should be added when using xtable.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
tab <- data.frame(cbind(1:4))
tab[1, 2] <- paste(letters, sep = "", collapse = "")
tab[3, 2] <- paste(LETTERS, sep = "", collapse = "")
tab[c(2, 4), 2] <- ""
colnames(tab) <- c("nr", "text")
tab
addLineBreak(tab, length = 12, col = 2)
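The core idea of addLineBreak, cutting a long string into pieces of bounded length, can be sketched in base R. This is only an illustration and not the package's implementation: the helper break_string is a hypothetical name, and it handles a single string rather than a dataframe column.

```r
# Hypothetical helper (not part of reporttools): split one string into
# consecutive pieces of at most `max_len` characters.
break_string <- function(s, max_len) {
  if (nchar(s) == 0) return(s)
  starts <- seq(1, nchar(s), by = max_len)
  substring(s, starts, pmin(starts + max_len - 1, nchar(s)))
}

# The 26 lowercase letters, broken after every 12 characters.
pieces <- break_string(paste(letters, collapse = ""), 12)
print(pieces)
```

Each piece would then occupy its own row in the widened table, which is why the function also has to report where every original entry ends.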
Attach levels "absent" and "present" to a 0-1 vector and turn it into a factor.
attachPresAbs(v)
v |
Vector. |
Factor with the corresponding levels.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
set.seed(1977)
vec <- round(runif(10, 0, 1))
attachPresAbs(vec)
Attach levels "no" and "yes" to a 0-1 vector and turn it into a factor.
attachYesNo(v)
v |
Vector. |
Factor with the corresponding levels.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
set.seed(1977)
vec <- round(runif(10, 0, 1))
attachYesNo(vec)
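Both attachYesNo and attachPresAbs amount to labelling the two values of a 0-1 vector. A minimal base R sketch of that idea follows; label_binary is a hypothetical helper, not a function of the package, and the treatment of NA shown here is simply what factor() does by default.

```r
# Hypothetical helper (not part of reporttools): turn a 0-1 vector into a
# factor with two labelled levels.
label_binary <- function(v, labels = c("no", "yes")) {
  factor(v, levels = c(0, 1), labels = labels)
}

vec <- c(0, 1, 1, 0, NA)
yes_no <- label_binary(vec)
pres_abs <- label_binary(vec, labels = c("absent", "present"))
print(yes_no)
print(pres_abs)
```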
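Given two vectors d1 and d2 of date type, this function outputs all pairs of entries of d1 and d2 that contradict the expected order, in which each date in d1 precedes the corresponding date in d2.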
checkDateSuccession(d1, d2, pat, names = NA, lab = "", typ = c("R", "tex")[2])
d1 |
Supposedly earlier dates. |
d2 |
Supposedly later dates. |
pat |
Corresponding list of patient (observation) numbers. |
names |
Names of date vectors, of length 3. |
lab |
Label of the generated LaTeX table. |
typ |
Type of output. |
A LaTeX table is output.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
set.seed(1977)
diagnosis <- as.Date(round(runif(10, min = 35000, max = 40000)), origin = "1899-12-30")
death <- as.Date(round(runif(10, min = 35000, max = 40000)), origin = "1899-12-30")
## check whether diagnosis was before death
checkDateSuccession(diagnosis, death, 1:10, names = c("Pat", "diagnosis", "death"),
    lab = "tab: diag --> death")
checkDateSuccession(diagnosis, death, 1:10, names = c("Pat", "diagnosis", "death"),
    lab = "tab: diag --> death", typ = "R")
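The check performed by checkDateSuccession can be approximated in base R by comparing the two date vectors element by element. The data below are invented for illustration, and treating equal dates as a violation is an assumption of this sketch rather than documented behaviour.

```r
# Invented example data: three patients, one with dates in the wrong order.
pat <- 1:3
diagnosis <- as.Date(c("2001-05-01", "2003-02-10", "2004-07-15"))
death <- as.Date(c("2002-01-20", "2002-11-30", "2006-03-02"))

# Flag pairs where the supposedly earlier date is not before the later one.
# Assumption: ties count as violations; missing dates are never flagged.
bad <- !is.na(diagnosis) & !is.na(death) & diagnosis >= death
violations <- data.frame(Pat = pat, diagnosis, death)[bad, ]
print(violations)
```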
Often, one does not want to span a data frame over several pages. This function breaks a data frame with n rows and p columns into a data frame with ceiling(n / cols) rows and cols * p columns.
colToMat(tab, cols)
tab |
The data frame to be reformatted. |
cols |
Number of columns of the reformatted data.frame. |
Returns the reformatted data frame.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
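The dimensions stated for colToMat can be reproduced with a short base R sketch. The helper wrap_columns is hypothetical, and the way it arranges the blocks (filling the first block top to bottom, then the next, padding with NA) is one plausible layout chosen for illustration, not necessarily the one the package uses.

```r
# Hypothetical helper (not part of reporttools): place consecutive blocks of
# rows side by side so that a long, narrow data frame becomes short and wide.
wrap_columns <- function(tab, cols) {
  n <- nrow(tab)
  rows <- ceiling(n / cols)
  # Row indices, padded with NA so that every block has `rows` rows.
  idx <- c(seq_len(n), rep(NA_integer_, rows * cols - n))
  blocks <- lapply(seq_len(cols), function(k) {
    block <- tab[idx[((k - 1) * rows + 1):(k * rows)], , drop = FALSE]
    rownames(block) <- NULL
    block
  })
  do.call(cbind, blocks)
}

tab <- data.frame(id = 1:7, x = letters[1:7])
out <- wrap_columns(tab, cols = 3)
print(dim(out))  # ceiling(7 / 3) rows and 3 * 2 columns
print(out)
```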
Replace all relevant characters in the entries and row- and colnames of a data frame such that xtable does not complain when displaying them.
correctVarNames(tab, rowcol = TRUE, cols = 1:ncol(tab))
tab |
The data frame to be formatted. |
rowcol |
If |
cols |
Provide a vector of column indices of columns whose entries are to be reformatted. If |
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
This function serves to display numbers in plain text, using a given number of digits after the decimal point.
disp(n, d1 = 2, d2 = 1)
n |
Vector of real numbers to be displayed. |
d1 |
Number of digits numbers are basically rounded to. |
d2 |
If numbers in |
t |
A vector of character strings containing the input number |
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
r <- c(0.23445, 0.000089)
disp(r)
This function serves to display a confidence interval in plain text, taking a vector of length 2 or a matrix containing the confidence limits and a given number of digits after the decimal point. A unit can additionally be supplied.
displayCI(ci, digit = 2, unit = "", text = "none")
ci |
Vector of length 2 or matrix of size |
digit |
Number of digits after the decimal point. |
unit |
Character string denoting a unit of measurement. |
text |
Specifies the way how the confidence interval should be displayed. |
A character string to be inserted in plain text.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
a <- 0.05
k <- qnorm(p = 1 - a / 2)
x <- 50
n <- 100
wilson.ci <- (x + k ^ 2 / 2) / (n + k ^ 2) + c(-1, 1) * (k * n ^ 0.5) / (n + k ^ 2) *
    sqrt(x / n * (1 - x / n) + k ^ 2 / (4 * n))
displayCI(wilson.ci)
displayCI(wilson.ci, digit = 1, unit = "cm", text = "none")
displayCI(wilson.ci, digit = 1, unit = "cm", text = "english")
Generate a LaTeX table of a coxph object. To be used in a Sweave document.
displayCoxPH(mod, cap = "", lab = "mod", dig.coef = 2, dig.p = 1)
mod |
|
cap |
The function provides an automatic caption displaying the number of observations and events in |
lab |
The LaTeX label for the generated table. |
dig.coef |
The number of significant digits for the estimated coefficients and the hazard ratios. |
dig.p |
The number of significant digits for |
Returns a LaTeX table containing columns with the estimated coefficients, hazard ratios, 95 percent confidence intervals for the hazard ratios and the p-values.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
## Not run:
# use example from coxph() in library 'survival'
test1 <- list(time = c(4, 3, 1, 1, 2, 2, 3),
    status = c(1, 1, 1, 0, 1, 1, 0),
    x = c(0, 2, 1, 1, 1, 0, 0),
    sex = c(0, 0, 0, 0, 1, 1, 1))
# fit a coxph() model
mod1 <- coxph(Surv(time, status) ~ x + sex, data = test1)
# generate table to insert in Sweave file
m1 <- displayCoxPH(mod1)
## End(Not run)
For each column of a dataframe, generate a LaTeX table against a given variable using displayKbyC and add a suitable p-value: if the expected frequencies are all greater than 5, a chi-squared test is computed, otherwise Fisher's exact test.
displayCrossTabs(vars, v0, nam0, lab0, percentage = c("none", "row", "col", "total")[1], add.p = TRUE)
vars |
Dataframe of nominal variables. |
v0 |
Nominal variable to tabulate all columns of |
nam0 |
Name of |
lab0 |
Initial string for table label. The column number of |
percentage |
Add percentages with respect to row, column, or table total. |
add.p |
Logical. If true, add |
Displays LaTeX K x C tables and returns a list containing all the information.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
set.seed(1977)
v0 <- round(runif(20, 0, 5))
v1 <- round(runif(20, 0, 3))
v2 <- round(runif(20, 0, 4))
displayCrossTabs(vars = data.frame(v1, v2), v0, nam0 = "v0", lab0 = "Q1")
Generate a LaTeX table of a frequency table that contains not only the cell frequencies, but also
pre-specified row- and col-names as well as totals of rows and cols.
displayKbyC(v1, v2, percentage = c("none", "row", "col", "total")[1], names = c("v1", "v2"), cap = "", lab = "", row.nam = NA, col.nam = NA)
v1 |
Vector with |
v2 |
Vector with |
percentage |
Add percentages with respect to row, column, or table total. |
names |
Names of the vectors under consideration. |
cap |
Caption of the LaTeX table to be generated. |
lab |
Label of the LaTeX table to be generated. |
row.nam |
Labels of |
col.nam |
Labels of |
Returns a LaTeX K x C table, together with the resulting computations. If you use this function in an .rnw file, you need to assign it to a (dummy) variable name in order for the results beyond the LaTeX table not to appear in the .tex file.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
set.seed(1977)
v1 <- round(runif(20, 0, 3))
v2 <- round(runif(20, 0, 5))
displayKbyC(v1, v2, percentage = "row", names = c("v1", "v2"), cap = "", lab = "",
    row.nam = NA, col.nam = NA)
Generates two matrices: One with complete observations and one with all observations containing at least one missing value.
eliminateNA(dat)
dat |
Dataframe with observations in rows. |
complete |
Dataframe containing complete observations. |
incomplete |
Dataframe containing observations with at least one |
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
pat <- 1:10
var1 <- rnorm(10)
var2 <- factor(round(rgamma(10, 2, 1)))
dat <- data.frame(cbind(pat, var1, var2))
dat[c(2, 8), 3] <- NA
eliminateNA(dat)
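The split that eliminateNA describes can be reproduced with complete.cases() from base R. The sketch below uses invented data and returns a list with the same two element names as documented above; it is an illustration of the idea, not the package's code.

```r
# Invented data with missing values in two different columns.
dat <- data.frame(pat = 1:5,
                  var1 = c(0.3, NA, 1.2, -0.7, 2.1),
                  var2 = c("a", "b", NA, "a", "b"))

# TRUE for rows without any NA.
ok <- complete.cases(dat)
split_dat <- list(complete = dat[ok, ], incomplete = dat[!ok, ])
print(split_dat)
```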
Takes a number and formats it as a percentage.
Leo Held
[email protected]
formatPval is intended for formatting p-values, and is based on the function format.pval in the base R-package.
formatPval(pv, digits = max(1, getOption("digits") - 2), eps = 0.0001, na.form = "NA", scientific = FALSE, includeEquality=FALSE)
pv |
a numeric vector. |
digits |
how many significant digits are to be used. |
eps |
a numerical tolerance: see ‘Details’. |
na.form |
character representation of |
scientific |
use scientific number format (not by default) |
includeEquality |
include equality signs in front of the large |
formatPval is mainly an auxiliary function for the family of table functions, but can also be useful on its own. If a p-value is smaller than eps, we return just that it is smaller than the threshold but no longer the exact value. This function is more general than format.pval, the behaviour of which can (almost) be obtained by using the options eps = .Machine$double.eps and scientific = TRUE.
A character vector.
## include equality signs?
formatPval(c(stats::runif(5), pi^-100, NA))
formatPval(c(stats::runif(5), pi^-100, NA), includeEquality = TRUE)
## try another eps argument
formatPval(c(0.1, 0.0001, 1e-7))
formatPval(c(0.1, 0.0001, 1e-7), eps = 1e-7)
## only the white space can differ with the base function result:
(a <- formatPval(c(0.1, 0.0001, 1e-27), eps = .Machine$double.eps, scientific = TRUE))
(b <- format.pval(c(0.1, 0.0001, 1e-27)))
all.equal(a, b)
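The role of eps can be illustrated with a few lines of base R. The helper format_p_sketch is hypothetical and much simpler than formatPval; in particular, its rounding and the exact form of the "< eps" string are assumptions of this sketch.

```r
# Hypothetical helper (not the package's code): report p-values below `eps`
# only as "< eps" and round the others to `digits` significant digits.
format_p_sketch <- function(pv, digits = 2, eps = 1e-4) {
  vapply(pv, function(p) {
    if (is.na(p)) return("NA")
    if (p < eps) return(paste("<", format(eps, scientific = FALSE)))
    format(signif(p, digits), scientific = FALSE)
  }, character(1))
}

res <- format_p_sketch(c(0.1, 0.0312, 1e-7, NA))
print(res)
```

Reporting only the threshold for very small values avoids printing long runs of zeros that carry no practical information in a report.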
Used by the tabulating functions to format column titles.
getFonts(font)
font |
Provide font type. |
Returns function to format column titles.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
Enclose a string in math dollars.
math(x)
x |
Character string. |
Returns x as a string within math dollars.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
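What math does can be pictured with a one-line base R equivalent. The name math_sketch is hypothetical, and the sketch assumes that nothing beyond the two dollar signs is added.

```r
# Hypothetical one-liner (not the package's code): wrap a string in
# LaTeX inline-math delimiters.
math_sketch <- function(x) paste0("$", x, "$")

wrapped <- math_sketch("\\bar{x}")
cat(wrapped, "\n")
```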
Extract all the missing values in a factor variable and turn them into a separate category.
NAtoCategory(fact, label = "missing")
fact |
Factor variable. |
label |
Label to be given to the missing values. |
Updated factor variable.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
set.seed(1977)
fact <- factor(sample(c(round(runif(10, 1, 3)), rep(NA, 10))), levels = 1:3,
    labels = c("no", "maybe", "yes"))
NAtoCategory(fact)
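A base R sketch of the same transformation is given below. It is not the package's implementation; placing the new level last is an assumption made here for illustration.

```r
# Invented factor with two missing entries.
fact <- factor(c("no", NA, "yes", "maybe", NA), levels = c("no", "maybe", "yes"))

# Replace NA by an explicit label and append that label to the levels.
label <- "missing"
updated <- factor(ifelse(is.na(fact), label, as.character(fact)),
                  levels = c(levels(fact), label))
print(table(updated))
```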
Extract all the missing values in a vector and turn them into a given value.
NAtoZero(v, value = 0)
v |
Vector. |
value |
Value to be given to the missing values. |
Updated vector.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
set.seed(1977)
vec <- sample(c(round(runif(10, 1, 3)), rep(NA, 10)))
NAtoZero(vec)
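The replacement itself is a single indexing step in base R. The helper replace_na_sketch below is hypothetical and only mirrors the documented arguments v and value.

```r
# Hypothetical helper (not the package's code): overwrite NA entries of a
# vector with a given value.
replace_na_sketch <- function(v, value = 0) {
  v[is.na(v)] <- value
  v
}

filled <- replace_na_sketch(c(2, NA, 3, NA, 1))
print(filled)
```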
Depending on the value of the smallest expected count, compute either a chi-squared test or Fisher's exact test.
nominalTest(tab, limit.exp = 5)
tab |
Frequency table, received by applying |
limit.exp |
If the smallest expected count is at most |
A list containing:
p |
The computed |
test |
A string indicating the test that was used. |
v1 <- as.factor(round(runif(40, 0, 3)))
v2 <- as.factor(round(runif(40, 2, 3)))
tab <- table(v1, v2)
nominalTest(tab)
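The decision rule described for nominalTest can be sketched with chisq.test() and fisher.test() from the stats package. The helper choose_test is hypothetical; the strings it returns in test and the choice of correct = FALSE for the chi-squared test are assumptions of this sketch, not documented behaviour.

```r
# Hypothetical helper (not the package's code): pick the test from the
# smallest expected cell count under independence.
choose_test <- function(tab, limit.exp = 5) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  if (min(expected) <= limit.exp) {
    list(p = fisher.test(tab)$p.value, test = "Fisher")
  } else {
    list(p = chisq.test(tab, correct = FALSE)$p.value, test = "Chi2")
  }
}

small <- matrix(c(3, 1, 1, 3), nrow = 2)      # expected counts all equal 2
large <- matrix(c(30, 20, 20, 30), nrow = 2)  # expected counts all equal 25
res_small <- choose_test(small)
res_large <- choose_test(large)
print(res_small$test)
print(res_large$test)
```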
Similar to pairwise.wilcox.test and pairwise.t.test, calculate pairwise comparisons of a nominal variable between group levels with corrections for multiple testing.
pairwise.fisher.test(x, g, p.adjust.method, ...)
x |
Response vector, nominal (or ordinal). |
g |
Grouping vector or factor. |
p.adjust.method |
Method for adjusting |
... |
Additional arguments to pass to |
Object of class "pairwise.htest"
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
fisher.test, p.adjust, pairwise.wilcox.test, pairwise.t.test
set.seed(1977)
x <- factor(abs(round(rnorm(99, 0, 1))))
g <- factor(round(runif(99, 0, 2)))
pairwise.fisher.test(x, g, p.adjust.method = "holm")
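The structure of such a pairwise comparison can be sketched with pairwise.table() and fisher.test() from the stats package. The data are invented, compare_levels is a hypothetical helper, and the sketch returns only the matrix of adjusted p-values rather than an object of class "pairwise.htest".

```r
# Invented data: a binary response in three groups of 20 observations each.
x <- factor(c(rep(c("a", "b"), c(15, 5)),
              rep(c("a", "b"), c(10, 10)),
              rep(c("a", "b"), c(4, 16))))
g <- factor(rep(c("g1", "g2", "g3"), each = 20))

# Fisher's exact test restricted to the observations of group levels i and j.
compare_levels <- function(i, j) {
  keep <- as.integer(g) %in% c(i, j)
  fisher.test(table(x[keep], droplevels(g[keep])))$p.value
}

# pairwise.table() evaluates all pairs and adjusts for multiple testing.
pvals <- pairwise.table(compare_levels, levels(g), p.adjust.method = "holm")
print(pvals)
```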
Many data analyses start with a display of descriptive statistics of important variables. This function takes a data frame of continuous variables and possible grouping (e.g. treatment), weighting, and subset variables and provides a LaTeX table of descriptive statistics separately per group and jointly for all observations, per variable. User-defined statistics can be provided.
tableContinuous(vars, weights = NA, subset = NA, group = NA, stats = c("n", "min", "q1", "median", "mean", "q3", "max", "s", "iqr", "na"), prec = 1, col.tit = NA, col.tit.font = c("bf", "", "sf", "it", "rm"), print.pval = c("none", "anova", "kruskal"), pval.bound = 10^-4, declare.zero = 10^-10, cap = "", lab = "", font.size = "footnotesize", longtable = TRUE, disp.cols = NA, nams = NA, ...)
vars |
A data frame containing continuous variables. See |
weights |
Optional vector of weights of each observation. |
subset |
Optional logical vector, indicates subset of observations to be used. |
group |
Optional grouping variable. |
stats |
Specify which descriptive statistics should be displayed in the table, by either directly providing
one or more of the default character strings (in arbitrary order) or a user-defined function. A user-defined
function must bear a name, take a vector as an argument ( |
prec |
Specify number of decimals to be displayed. |
col.tit |
Specify titles of columns. Note that the length of this vector must be equal to the length of
|
col.tit.font |
If |
print.pval |
If |
pval.bound |
|
declare.zero |
Computed descriptive statistics (not |
cap |
The caption of the resulting LaTeX table. |
lab |
The label of the resulting LaTeX table. |
font.size |
Font size for the generated table in LaTeX. |
longtable |
If |
disp.cols |
Only included for backward compatibility. Needs to be a vector built of (some of) the default statistics
character strings if not equal to |
nams |
A vector of strings, containing the names corresponding to the variables in |
... |
Arguments pass through to |
Outputs the LaTeX table.
If either one of the arguments group, weights, or subset is different from NA and if vars is a list, then it is assumed that all variables in vars are of equal length.
If longtable = TRUE (which is the default), the function generates a table that may be more than one page long; you need to include the package longtable in the LaTeX source.
If a list of variables is given to vars, not all of these variables need to be of the same length. However, note the Warning above.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
Rufibach, K. (2009)
reporttools: R-Functions to Generate LaTeX Tables of Descriptive Statistics.
Journal of Statistical Software, Code Snippets, 31(1).
doi:10.18637/jss.v031.c01.
data(CO2)
vars <- CO2[, 4:5]
group <- CO2[, "Treatment"]
weights <- c(rep(1, 60), rep(0, 10), rep(2, 14))
## display default statistics, provide neither group nor weights
tableContinuous(vars = vars, stats = c("n", "min", "mean", "median", "max", "iqr", "na"),
    print.pval = "kruskal", cap = "Table of continuous variables.", lab = "tab: descr stat")
## display default statistics, only use a subset of observations, grouped analysis
tableContinuous(vars = vars, weights = weights,
    subset = c(rep(TRUE, 57), rep(FALSE, 100 - 57)), group = group, prec = 3,
    print.pval = "kruskal", cap = "Table of continuous variables.", lab = "tab: descr stat")
## supply user-defined statistics: trimmed mean and IQR as an unbiased estimate
## of the population standard deviation in case of normal data
my.stats <- list("n", "na", "mean",
    "$\\bar{x}_{trim}$" = function(x){return(mean(x, trim = .05))},
    "iqr",
    "IQR.unbiased" = function(x){return(IQR(x) / (2 * qnorm(3 / 4)))})
tableContinuous(vars = vars, weights = weights, group = group, stats = my.stats,
    prec = 3, print.pval = "none", cap = "Table of continuous variables.",
    lab = "tab: descr stat")
## disp.cols and nams can still be used, for backward compatibility.
## If a list is given to vars, the variables can be of different length. However,
## then weights, subset, and group must be set to NA (the default).
tableContinuous(vars = list(CO2$conc, CO2$uptake, rnorm(1111), runif(2222)),
    nams = c("conc", "uptake", "random1", "random2"),
    disp.cols = c("n", "min", "median", "max", "iqr", "na"),
    cap = "Table of continuous variables.", lab = "tab: descr stat")
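As a rough picture of the numbers that end up in such a table, the following base R sketch computes a few statistics, including a user-defined one, per group of CO2$Treatment. It yields a plain numeric matrix, not a LaTeX table, ignores weights and subsets, and the names my_stats and by_group are made up for this example.

```r
data(CO2)

# Named list of statistics; the trimmed mean plays the role of a
# user-defined statistic.
my_stats <- list(n = length,
                 mean = mean,
                 median = median,
                 trimmed = function(x) mean(x, trim = 0.05))

# One row per treatment group, one column per statistic.
by_group <- sapply(my_stats, function(f) tapply(CO2$uptake, CO2$Treatment, f))
print(round(by_group, 1))
```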
Many data analyses start with a display of descriptive statistics of important variables. This function takes a data frame of date variables and possible grouping (e.g. treatment), weighting, and subset variables and provides a LaTeX table of descriptive statistics separately per group and jointly for all observations, per variable.
tableDate(vars, weights = NA, subset = NA, group = NA, stats = c("n", "min", "q1", "median", "mean", "q3", "max", "na"), col.tit = NA, col.tit.font = c("bf", "", "sf", "it", "rm"), print.pval = TRUE, pval.bound = 10^-4, cap = "", lab = "", font.size = "footnotesize", longtable = TRUE, disp.cols = NA, nams = NA, ...)
vars |
A data frame of date variables. See |
weights |
Optional vector of weights of each observation. |
subset |
Optional logical vector, indicates subset of observations to be used. |
group |
Optional grouping variable. |
stats |
Specify which descriptive statistics should be displayed in the table, by either directly providing one or more of the default character strings (in arbitrary order). |
col.tit |
Specify titles of columns. |
col.tit.font |
If |
print.pval |
If |
pval.bound |
|
cap |
The caption of the resulting LaTeX table. |
lab |
The label of the resulting LaTeX table. |
font.size |
Font size for the generated table in LaTeX. |
longtable |
If |
disp.cols |
Only included for backward compatibility. Needs to be a vector of (some of) the default
statistics character strings if not equal to |
nams |
A vector of strings, containing the names corresponding to the variables in |
... |
Arguments pass through to |
Outputs the LaTeX table.
If either one of the arguments group, weights, or subset is different from NA and if vars is a list, then it is assumed that all variables in vars are of equal length.
If longtable = TRUE (which is the default), the function generates a table that may be more than one page long; you need to include the package longtable in the LaTeX source.
If a list of variables is given to vars, not all of these variables need to be of the same length. However, note the Warning above.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
Rufibach, K. (2009)
reporttools: R-Functions to Generate LaTeX Tables of Descriptive Statistics.
Journal of Statistical Software, Code Snippets, 31(1).
doi:10.18637/jss.v031.c01.
set.seed(1977)
diagnosis <- as.Date(round(runif(10, min = 35000, max = 40000)), origin = "1899-12-30")
death <- as.Date(round(runif(10, min = 35000, max = 40000)), origin = "1899-12-30")
vars <- data.frame(diagnosis, death)
group <- sample(c(rep("A", 5), rep("B", 5)))
tableDate(vars = vars, group = group, stats = c("n", "min", "median", "max", "na"),
    cap = "Table of date variables.", lab = "tab: descr stat date")
## suppose we have weighted observations
weights <- c(2, 3, 1, 4, rep(1, 6))
subset <- 1:5
tableDate(vars = vars, weights = weights, subset = subset,
    cap = "Table of date variables.", lab = "tab: descr stat date")
## For backward compatibility, disp.cols and nams are still working.
## If a list is given to vars, the variables can be of different length.
## However, then weights, subset, and group must be set to NA (the default).
tableDate(vars = list(diagnosis, death), nams = c("Diagnosis", "Death"),
    disp.cols = c("n", "na", "min", "max"), print.pval = FALSE,
    cap = "Table of date variables.", lab = "tab: descr stat date")
Many data analyses start with a display of descriptive statistics of important variables. This function takes a data frame of nominal variables and possible grouping (e.g. treatment), weighting, and subset variables and provides a LaTeX table of descriptive statistics separately per group and jointly for all observations, per variable.
tableNominal(vars, weights = NA, subset = NA, group = NA, miss.cat = NA, print.pval = c("none", "fisher", "chi2"), pval.bound = 10^-4, fisher.B = 2000, vertical = TRUE, cap = "", lab = "", col.tit.font = c("bf", "", "sf", "it", "rm"), font.size = "footnotesize", longtable = TRUE, nams = NA, cumsum = TRUE, ...)
vars |
A data frame of nominal variables. See |
weights |
Optional vector of weights of each observation. |
subset |
Optional logical vector, indicates subset of observations to be used. |
group |
Optional grouping variable. |
miss.cat |
Vector specifying the factors in |
print.pval |
Add |
pval.bound |
|
fisher.B |
Number of simulations to compute |
vertical |
If |
cap |
The caption of the resulting LaTeX table. |
lab |
The label of the resulting LaTeX table. |
col.tit.font |
Choose the font for the column titles here (default: boldface). |
font.size |
Font size for the generated table in LaTeX. |
longtable |
If |
nams |
A vector of strings, containing the names corresponding to the variables in |
cumsum |
If |
... |
Arguments pass through to |
Outputs the LaTeX table.
If either one of the arguments group, weights, or subset is different from NA and if vars is a list, then it is assumed that all variables in vars are of equal length.
If longtable = TRUE (which is the default), the function generates a table that may be more than one page long; you need to include the package longtable in the LaTeX source.
If a list of variables is given to vars, not all of these variables need to be of the same length. However, note the Warning above.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
Rufibach, K. (2009)
reporttools: R-Functions to Generate LaTeX Tables of Descriptive Statistics.
Journal of Statistical Software, Code Snippets, 31(1).
doi:10.18637/jss.v031.c01.
    library(reporttools)

    data(CO2)
    vars <- CO2[, 1:2]
    group <- CO2[, "Treatment"]
    weights <- c(rep(1, 60), rep(0, 10), rep(2, 14))

    ## first all observations
    tableNominal(vars = vars, weights = weights, group = group,
        cap = "Table of nominal variables.", lab = "tab: nominal")

    ## do not include cumulative percentages
    tableNominal(vars = vars, weights = weights, group = group,
        cap = "Table of nominal variables.", lab = "tab: nominal",
        cumsum = FALSE)

    ## but include p-value for Fisher's exact test
    tableNominal(vars = vars, weights = weights, group = group,
        cap = "Table of nominal variables.", lab = "tab: nominal",
        print.pval = "fisher", cumsum = FALSE)

    ## Fisher's exact test without simulated p-value
    tableNominal(vars = vars, weights = weights, group = group,
        cap = "Table of nominal variables.", lab = "tab: nominal",
        print.pval = "fisher", fisher.B = Inf, cumsum = FALSE)

    ## then only consider a subset of observations
    subset <- c(1:50, 60:70)
    tableNominal(vars = vars, weights = weights, subset = subset, group = group,
        cap = "Table of nominal variables.", lab = "tab: nominal")

    ## do not include cumulative percentages
    tableNominal(vars = vars, weights = weights, subset = subset, group = group,
        cap = "Table of nominal variables.", lab = "tab: nominal",
        cumsum = FALSE)

    ## Not run:
    ## caption placement at the top and repeat column headings on top of each page
    ## in the longtable format. The backslashes before hline and endhead are
    ## doubled because they sit inside an R string.
    tableNominal(vars = vars, cap = "Table of nominal variables.", cumsum = FALSE,
        caption.placement = "top", longtable = TRUE,
        add.to.row = list(pos = list(0), command = "\\hline \\endhead "))
    ## End(Not run)
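The remark about backslashes in the last example comes down to how R escapes strings: a literal backslash has to be written as `\\` in source code. The following base R sketch builds the `add.to.row` list on its own so the resulting LaTeX commands can be inspected before they are passed on. The variable names `cmd` and `add.to.row` are chosen here for illustration only.

```r
# In R source code, a literal backslash inside a string is written as "\\".
cmd <- "\\hline \\endhead "

# cat() prints the string as LaTeX will receive it (single backslashes).
cat(cmd, "\n")

# The list structure expected by the add.to.row argument of print.xtable:
# a list of row positions and a character vector of commands.
add.to.row <- list(pos = list(0), command = cmd)
str(add.to.row)
```

Printing with `cat()` rather than `print()` is a quick way to confirm that the string holds single backslashes, since `print()` shows the escaped form.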
This function generates a one-column matrix, containing strings of assignments of the variables in a data frame.
transformVarNames(dat, name)
dat | Data frame. |
name | Name of the data frame. |
One-column matrix of strings containing the assignments.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
    labpar1 <- rnorm(50)
    labor.param2 <- rgamma(50, 2, 1)
    dat <- data.frame(labpar1, labor.param2)
    transformVarNames(dat, name = "dat")
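To make "strings of assignments" concrete, here is a minimal base R sketch of how such a one-column matrix could be built. The string layout `<variable> <- <name>$<variable>` is an assumption made for illustration; the exact text produced by `transformVarNames` is not specified above and may differ.

```r
labpar1 <- rnorm(50)
labor.param2 <- rgamma(50, 2, 1)
dat <- data.frame(labpar1, labor.param2)

# Assumed format (hypothetical): one assignment string per column,
# of the form "<variable> <- <name>$<variable>".
name <- "dat"
assignments <- matrix(paste0(names(dat), " <- ", name, "$", names(dat)),
                      ncol = 1)
assignments
```

Strings of this kind can be pasted into a script to pull the columns of a data frame into the workspace as separate variables.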
This function generates a one-column matrix containing strings of assignments of the variables in a
data frame, intended for use with plyr.
transformVarNames2(nams)
nams | Variable names, typically the column names of a data frame. |
One-column matrix of strings containing the assignments.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
    labpar1 <- rnorm(50)
    labor.param2 <- rgamma(50, 2, 1)
    dat <- data.frame(labpar1, labor.param2)
    transformVarNames2(colnames(dat))
For each column of a data frame, generate a row in a resulting table that contains basic descriptive statistics, effect size, p-value,
and confidence intervals for a two-group comparison, where the grouping variable is given separately.
twoGroupComparisons(vars, v0, conf.level = 0.95, paired = FALSE)
vars | Data frame of continuous variables. |
v0 | Binary variable that defines the two groups. |
conf.level | Confidence level used to compute the confidence intervals. |
paired | Logical indicating whether the comparisons are paired. |
A list consisting of the following elements:
raw | Matrix that contains the above as raw numbers. |
formatted | The same table where numbers are formatted and confidence intervals are given as character strings. |
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
    set.seed(1977)
    v0 <- round(runif(200, 0, 1))
    v1 <- rnorm(200)
    v2 <- rgamma(200, 2, 1)
    twoGroupComparisons(vars = data.frame(v1, v2), v0)
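For orientation, the quantities that go into one row of such a table (group means, their difference, a confidence interval, and a p-value) can be computed for a single variable with base R's `t.test`. This is only an illustrative sketch. The text above does not say which test or effect-size definition `twoGroupComparisons` uses, so the Welch t-test and the difference in means are assumptions.

```r
set.seed(1977)
v0 <- round(runif(200, 0, 1))   # binary grouping variable
v1 <- rnorm(200)                # one continuous variable

x0 <- v1[v0 == 0]
x1 <- v1[v0 == 1]

# Welch two-sample t-test (assumed here, not confirmed for the package);
# the confidence interval refers to mean(x0) - mean(x1).
tt <- t.test(x0, x1, conf.level = 0.95, paired = FALSE)

res <- c(mean0 = mean(x0),
         mean1 = mean(x1),
         diff  = mean(x0) - mean(x1),
         lower = tt$conf.int[1],
         upper = tt$conf.int[2],
         p     = tt$p.value)
round(res, 3)
```

Looping this over the columns of a data frame and binding the results row by row would give a matrix comparable in spirit to the `raw` element described above.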
Transform a given string of variable names, separated by ", ", into a vector of corresponding variable names.
varNamesToChar(varnam)
varnam | Character string, where variable names are separated by commas. |
Vector of variable names.
Kaspar Rufibach (maintainer), [email protected],
http://www.kasparrufibach.ch
    nams <- "var1, var2, var3"
    varNamesToChar(nams)
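The same kind of transformation can be sketched in base R with `strsplit` and `trimws`. This is not the package's implementation, only an illustration of splitting a comma-separated string into a character vector. The name `parts` is introduced here for the example.

```r
nams <- "var1, var2, var3"

# Split on commas, then strip the whitespace around each piece.
parts <- trimws(strsplit(nams, ",", fixed = TRUE)[[1]])
parts
```

Trimming after the split, rather than splitting on ", " directly, keeps the sketch tolerant of inputs with irregular spacing after the commas.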