Question:
I am using the R package stargazer to generate tables in LaTeX. It works great, but I cannot figure out how to format my numbers correctly. I want all numbers to show exactly one decimal place (e.g. 1.0, 0.1, 10.5). I therefore use the option digits = 1. However, for whole numbers like 1 this gives me 1 instead of 1.0. How can I get a decimal place even for whole numbers (1.0 instead of 1)?
Answer 1:
You can post-process the stargazer output with a regular expression to add the decimal places back. You may need to adjust the pattern slightly, depending on the type of summary you are generating with stargazer, but since the question includes no minimal example, the best I can do is give a generic example of this method:
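library(stargazer)
# stargazer() prints the table and also returns its lines as a character vector
star <- stargazer(attitude, digits = 1, digits.extra = 1)
# append ".0" to every run of digits that sits between "& " and a space
star <- gsub("& ([0-9]+) ", "& \\1\\.0 ", star)
cat(star, sep = "\n")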
# % Table created by stargazer v.5.2 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu
# % Date and time: Sat, Oct 08, 2016 - 8:11:26 PM
# \begin{table}[!htbp] \centering
# \caption{}
# \label{}
# \begin{tabular}{@{\extracolsep{5pt}}lccccc}
# \\[-1.8ex]\hline
# \hline \\[-1.8ex]
# Statistic & \multicolumn{1}{c}{N} & \multicolumn{1}{c}{Mean} & \multicolumn{1}{c}{St. Dev.} & \multicolumn{1}{c}{Min} & \multicolumn{1}{c}{Max} \\
# \hline \\[-1.8ex]
# rating & 30.0 & 64.6 & 12.2 & 40.0 & 85.0 \\
# complaints & 30.0 & 66.6 & 13.3 & 37.0 & 90.0 \\
# privileges & 30.0 & 53.1 & 12.2 & 30.0 & 83.0 \\
# learning & 30.0 & 56.4 & 11.7 & 34.0 & 75.0 \\
# raises & 30.0 & 64.6 & 10.4 & 43.0 & 88.0 \\
# critical & 30.0 & 74.8 & 9.9 & 49.0 & 92.0 \\
# advance & 30.0 & 42.9 & 10.3 & 25.0 & 72.0 \\
# \hline \\[-1.8ex]
# \end{tabular}
# \end{table}
In this example, the pattern "& ([0-9]+) " looks for "& " followed by a run of digits, followed by a space. It then replaces the match with "& ", the same group of digits it found (using the backreference \\1), a period, a decimal zero and a space. The period is written as \\. in the replacement; the escape is harmless there but not required, because a period is only a special character in the search pattern, not in the replacement string.
Other summary formats produced by stargazer may require additional cases in the search pattern, such as numbers followed by a character other than a space (e.g. a comma), or numbers that are not preceded by an &. In any case, the general approach is the same.
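As a rough sketch of how the pattern could be generalised, the helper below uses Perl-style lookarounds so that only the digits themselves are replaced, and takes the number of decimal places as an argument. The name pad_integers is hypothetical and not part of stargazer; the sample rows are typed in by hand to mimic the table body, so the snippet runs without the package.

```r
# Hypothetical helper (not part of stargazer): append ".0", ".00", ... to
# integer cells in LaTeX table rows.
pad_integers <- function(lines, digits = 1) {
  zeros <- strrep("0", digits)
  # Match a run of digits preceded by "& " and followed by a space.
  # Cells that already contain a decimal point do not match.
  gsub("(?<=& )([0-9]+)(?= )", paste0("\\1.", zeros), lines, perl = TRUE)
}

# Hand-written rows in the shape stargazer produces for a summary table
rows <- c("rating & 30 & 64.6 & 12.2 & 40 & 85 \\\\",
          "critical & 30 & 74.8 & 9.9 & 49 & 92 \\\\")

cat(pad_integers(rows), sep = "\n")
# rating & 30.0 & 64.6 & 12.2 & 40.0 & 85.0 \\
# critical & 30.0 & 74.8 & 9.9 & 49.0 & 92.0 \\
```

Like the one-liner above, this also pads the N column, and it assumes every cell is followed by a space; both assumptions may need adjusting for other table types.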
Answer 2:
Thanks for pushing towards a more integrated answer, wolfsatthedoor. I was wondering about this myself for quite some time and it appears that the stargazer code is deliberately written like this.
Looking into the source code with
trace(stargazer:::.stargazer.wrap, edit = TRUE)
reveals at line ~4485 (may be slightly different depending on the exact version) that .summ.stat.publish.statistic uses .is.all.integers to check whether the variable contains only integers. If that is the case, the final value is rounded to 0 digits (hard-coded), or to 1 digit when the median itself is not an integer:
else if (which.statistic == "median") {
  median.value <- median(temp.var, na.rm = TRUE)
  if (.is.all.integers(temp.var) == FALSE) {
    how.much.to.round <- .format.s.round.digits
  }
  else {
    if (.is.all.integers(median.value) == TRUE) {
      how.much.to.round <- 0
    }
    else {
      how.much.to.round <- 1
    }
  }
  return(.iround(median.value, how.much.to.round))
}
To change this behaviour, you would have to set every how.much.to.round to .format.s.round.digits, which is the value specified with the digits argument. You would have to do that for each summary statistic separately, i.e., median, min, max, and p, in lines ~4510 to 4570. This also ensures that the column N does not carry unnecessary digits.
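The effect of that change can be illustrated outside stargazer with a simplified stand-in for the rounding decision. The functions is_all_integers and format_stat below are hypothetical names written for this sketch; they only imitate the logic described above and are not the package's internals.

```r
# Simplified stand-in for the integer check (assumption: a value counts as
# an integer when rounding does not change it).
is_all_integers <- function(x) all(x == round(x), na.rm = TRUE)

# Format one summary statistic of the variable x.
# force_digits = FALSE imitates the hard-coded 0 digits for integer data;
# force_digits = TRUE imitates the patched behaviour.
format_stat <- function(value, x, digits, force_digits = FALSE) {
  d <- if (force_digits || !is_all_integers(x)) digits else 0
  formatC(round(value, d), format = "f", digits = d)
}

x <- c(1, 1, 1, 1)
format_stat(min(x), x, digits = 2)                       # "1"
format_stat(min(x), x, digits = 2, force_digits = TRUE)  # "1.00"
```

With force_digits = TRUE the integer-valued variable is formatted with the requested number of digits, which is what the edit to the stargazer source is meant to achieve.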
The saved custom stargazer function should then behave as follows (note: every time you restart the R session you would have to redo the changes; you can prevent this by exporting the function):
stargazer((cbind(A = c(1,1,1,1), B = c(3,4,3,3))), summary = T, digits = 2, header = F, type = "text")
# ===================================
# Statistic N Mean St. Dev. Min Max
# -----------------------------------
# A 4 1.00 0.00 1.00 1.00
# B 4 3.25 0.50 3.00 4.00
# -----------------------------------
stargazer((cbind(A = c(1,1,1,1), B = c(3,4,3,3))), summary = T, digits = 2, header = F)
# \begin{table}[!htbp] \centering
# \caption{}
# \label{}
# \begin{tabular}{@{\extracolsep{5pt}}lccccc}
# \\[-1.8ex]\hline
# \hline \\[-1.8ex]
# Statistic & \multicolumn{1}{c}{N} & \multicolumn{1}{c}{Mean} & \multicolumn{1}{c}{St. Dev.} & \multicolumn{1}{c}{Min} & \multicolumn{1}{c}{Max} \\
# \hline \\[-1.8ex]
# A & 4 & 1.00 & 0.00 & 1.00 & 1.00 \\
# B & 4 & 3.25 & 0.50 & 3.00 & 4.00 \\
# \hline \\[-1.8ex]
# \end{tabular}
# \end{table}
Answer 3:
(First open the stargazer source with trace.) If you want every statistic to show the number of digits specified in the call, replace, e.g.:
else if (which.statistic == "min") {
  if (.is.all.integers(temp.var) == FALSE) {
    how.much.to.round <- .format.s.round.digits
  }
  else {
    how.much.to.round <- .format.s.round.digits
  }
  return(.iround(min(temp.var, na.rm = TRUE),
                 how.much.to.round))
}
with:
else if (which.statistic == "min") {
  return(.iround(min(temp.var, na.rm = TRUE), .format.s.round.digits))
}
You can do the same for max and the other statistics. The relevant blocks are at approximately lines 4530-4576.