Unexpected behavior of kable when called from lapply

Posted 2020-07-09 08:21

Question:

I am trying to understand the two following unexpected behaviors of the kable function when knitting HTML using the knitr package (in RStudio 0.98.977 on Ubuntu 14.04):

  1. When two calls of kable are made from within lapply, only the first call produces a pretty display in the final HTML.
  2. When two calls of kable are made from within a function that also uses print statements, only the last call produces a pretty display in the final HTML.

Example code is given below:

Load library:

```{r init}
library("knitr")
```

Define dataframe:

```{r define_dataframe}
df <- data.frame(letters=c("a", "b", "c"), numbers=c(1, 2, 3))
rownames(df) <- c("x", "y", "z")
```

### Example 1: pretty display with simple call

The dataframe is displayed nicely twice when knitting HTML with the following code:

```{r pretty_display1, results="asis"}
kable(df)
kable(df)
```

### Example 2: unexpected display with lapply

The dataframe is displayed nicely only the first time when knitting HTML with the following code:

```{r unexpected_display1, results="asis"}
lst <- list(df, df)
lapply(lst, kable)
```

### Example 3: pretty display with function

The dataframe is displayed nicely twice when knitting HTML with the following code:

```{r pretty_display2, results="asis"}
foo1 <- function (df) {
  kable(df)
}
foo2 <- function (df) {
  foo1(df)
  foo1(df)
}
foo2(df)
```

### Example 4: unexpected display with function containing print statements

The dataframe is displayed nicely only the second time when knitting HTML with the following code:

```{r unexpected_display2, results="asis"}
foo1 <- function (df) {
  kable(df)
}
foo2 <- function (df) {
  print("first display")
  foo1(df)
  print("second display")
  foo1(df)
}
foo2(df)
```

Is there an explanation for these behaviors, and how can I work around them?

Answer 1:

The output of kable is a side effect: you can store the return value in a variable, but simply calling kable writes the table to the console. When you run kable(df) twice, this isn't a problem. Nothing else is displayed, and the function writes its output to the console twice.

However, when you run lapply(lst, kable), each call writes its table to the console, and then the list returned by lapply is displayed as well. Try running this in your console:

lst <- list(df, df)
lapply(lst, kable)

You should get exactly this:

|   |letters | numbers|
|:--|:-------|-------:|
|x  |a       |       1|
|y  |b       |       2|
|z  |c       |       3|


|   |letters | numbers|
|:--|:-------|-------:|
|x  |a       |       1|
|y  |b       |       2|
|z  |c       |       3|
[[1]]
[1] "|   |letters | numbers|" "|:--|:-------|-------:|"
[3] "|x  |a       |       1|" "|y  |b       |       2|"
[5] "|z  |c       |       3|"

[[2]]
[1] "|   |letters | numbers|" "|:--|:-------|-------:|"
[3] "|x  |a       |       1|" "|y  |b       |       2|"
[5] "|z  |c       |       3|"

Notice how the correct markdown is output, and then the actual value of the list you created is displayed. This is what creates the bad output.
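The same mechanism can be reproduced in base R without knitr. The following is a minimal sketch, not knitr's actual implementation: noisy_identity is a hypothetical function that writes a line with cat and then returns its argument, standing in for a side-effecting kable. capture.output is used only so that the console output can be inspected as a character vector.

```r
# Hypothetical stand-in for a side-effecting function such as the kable
# discussed above: it writes a line to the console and returns a value.
noisy_identity <- function(x) {
  cat("side effect for ", x, "\n", sep = "")
  x
}

# The value of lapply() is visible, so it is printed after the side
# effects have already been written.
with_list <- capture.output(lapply(1:2, noisy_identity))

# invisible() suppresses that printing, leaving only the side effects.
without_list <- capture.output(invisible(lapply(1:2, noisy_identity)))

print(with_list)     # side-effect lines followed by the [[1]], [[2]] list dump
print(without_list)  # side-effect lines only
```

In the first case the list dump follows the side-effect lines, which is the pattern shown in the console output above.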

The functional paradigm doesn't work particularly well with side effects, so you have a few options: store the results of kable by setting its output parameter to FALSE, use a for loop to go through your list, or prevent the result list from being displayed. Here are some examples that will work.

```{r nograpes1, results="asis"}
lst <- list(df, df)
for(x in lst) kable(x) # Don't create a list, just run the function over each element
```

```{r nograpes2, results="asis"}
lst <- list(df, df)
invisible(lapply(lst, kable)) # prevent the displaying of the result list.
```

```{r nograpes3, results="asis"}
lst <- list(df, df)
l <- lapply(lst, kable) # Store the list and do nothing with it.
```

In my opinion, this is a nice example of when for should be used in R, as it most cleanly expresses how you want to use a side-effect based function.
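One caveat, offered as an assumption about later knitr releases rather than the version used in the question: there, kable returns the table as a value and nothing is written until that value is printed. Under that assumption a bare kable(x) inside a for loop produces no output, and an explicit print is needed. A minimal sketch (in an R Markdown document the chunk would still need results="asis"):

```r
library(knitr)

df <- data.frame(letters = c("a", "b", "c"), numbers = c(1, 2, 3))
rownames(df) <- c("x", "y", "z")
lst <- list(df, df)

# Assumption: a knitr release in which kable() returns the table as a value
# and writes nothing until that value is printed.
for (x in lst) {
  print(kable(x))  # explicit print, because loops do not auto-print values
  cat("\n\n")      # blank lines keep consecutive tables separate
}
```

With the knitr version from the question, where kable writes its output itself, the explicit print would be unnecessary and the loop shown earlier is sufficient.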



Answer 2:

nograpes has given a nice answer to the part of your question about lapply. Here I address the other part.

The print function works by side effect: it writes a string to the output. In your example the second print call runs immediately after the first kable call, so it adds a line directly after the Markdown table syntax. Because R Markdown requires blank lines before and after a table, this print output contaminates the kable output, and the first table is not parsed as a table.

You can see the raw R Markdown output, and the culprit, by removing results="asis" from the chunk options:

## [1] "first display"
## 
## 
## |   |letters | numbers|
## |:--|:-------|-------:|
## |x  |a       |       1|
## |y  |b       |       2|
## |z  |c       |       3|
## [1] "second display"  # <- here is the culprit! 
## 
## 
## |   |letters | numbers|
## |:--|:-------|-------:|
## |x  |a       |       1|
## |y  |b       |       2|
## |z  |c       |       3|

You can clearly see how the second display is appended immediately after the table, interfering with the Markdown processing.

If you do want to write strings or other information to the output, you can use the cat function instead. Remember to emit blank lines as well, so that the R Markdown table syntax stays intact.

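```{r pretty_display3, results="asis"}
# The chunk label differs from the one in Example 4, because knitr
# rejects duplicate chunk labels within one document by default.
foo1 <- function (df) {
  kable(df)
}
foo2 <- function (df) {
  # cat() writes the text without the [1] prefix and quotes that print() adds;
  # the leading "\n\n" keeps the text separate from any preceding table.
  cat("\n\nfirst display")
  foo1(df)
  cat("\n\nsecond display")
  foo1(df)
}
foo2(df)
```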
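The difference between print and cat described above can be checked in a plain R console without knitr. This is a small sketch using only base R; capture.output is there just to make the emitted lines visible as a character vector.

```r
# print() on a character vector adds an index prefix and quotes, and it
# emits no blank line before or after the text.
printed <- capture.output(print("second display"))
# printed holds the single line: [1] "second display"

# cat() writes the text as-is; the explicit "\n\n" on each side supplies
# the blank lines that Markdown needs around a table.
written <- capture.output(cat("\n\nsecond display\n\n"))

print(printed)
print(written)
```

The trailing "\n\n" is an extra precaution added in this sketch. The chunk above relies on only the leading newlines, which is enough when the table output itself begins with blank lines, as the raw output shown earlier suggests.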