We have a problem using RMarkdown on multiple operating systems.
Initially, an .Rmd file is created on a Linux system (Ubuntu 12.04 LTS) and then pushed to a GitHub repo.
It can be compiled ("knitted") without problems on this system.
It is then pulled on a Windows 7 machine with RStudio installed.
There, when trying to compile, the following error shows up:

    Error in yaml::yaml.load(front_matter) :
      Reader error: invalid leading UTF-8 octet: #FC at 66
    Calls: <Anonymous> -> parse_yaml_front_matter -> <Anonymous> -> .Call
    Execution halted

- When creating another .Rmd file on the Windows system, it works flawlessly.
- When creating another .Rmd file on the Windows system and copying everything except the first few lines of the "problematic" file into it, the new file also compiles flawlessly.
I compared both files in a hex view (in Sublime) on both operating systems: they are exactly the same, byte for byte.
Has somebody else seen that error before?
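Update: It seems as if a German umlaut ("ü") is causing the problem. Its Unicode code point is U+00FC (escaped as \u00FC), according to http://www.endmemo.com/unicode/unicodeconverter.php. Note that FC is the single byte for "ü" in Latin-1/Windows-1252, whereas its UTF-8 encoding is the two bytes C3 BC, so the #FC in the error suggests the parser received Latin-1 bytes rather than UTF-8.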
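To make the byte-level difference concrete, here is a minimal sketch in R. It only uses base functions (rawToChar, charToRaw, iconv); the variable names are made up for illustration, and it says nothing about where in the knitting pipeline the re-encoding actually happens.

```r
# Build "ü" from its two UTF-8 bytes so the result does not depend on the locale.
u_utf8 <- rawToChar(as.raw(c(0xc3, 0xbc)))
Encoding(u_utf8) <- "UTF-8"

# Re-encode to Latin-1 (Windows-1252 uses the same byte for this character).
u_latin1 <- iconv(u_utf8, from = "UTF-8", to = "latin1")

print(charToRaw(u_utf8))    # c3 bc
print(charToRaw(u_latin1))  # fc

# Asking iconv() to treat the lone Latin-1 byte as UTF-8 input: iconv() yields
# NA for input that is not valid in the 'from' encoding.
reparsed <- iconv(u_latin1, from = "UTF-8", to = "UTF-8")
print(reparsed)
```

A lone FC byte is what the YAML reader complains about, which would fit a file whose content was converted to the native Windows encoding before being parsed as UTF-8.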
In general, it seems that Unicode is not correctly recognized by R, RStudio, or knitr on Windows. When I type some umlauts into a new .Rmd file and knit it, I get output such as "öää". In RStudio > Tools > Global Options, I set the default text encoding to "UTF-8". I also did that for R, in the Rprofile.site file (options(encoding="UTF-8")).
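As a diagnostic, one could check how R itself sees a UTF-8 file, independent of RStudio and knitr. The following is only a sketch under assumptions: the file is a throwaway temp file (not the real .Rmd), and passing encoding = "UTF-8" to readLines merely marks the strings as UTF-8, it does not re-encode them.

```r
# Write "title: ü\n" as raw UTF-8 bytes so the file content is locale-independent.
path <- tempfile(fileext = ".Rmd")  # hypothetical stand-in for the real file
writeBin(as.raw(c(0x74, 0x69, 0x74, 0x6c, 0x65, 0x3a, 0x20, 0xc3, 0xbc, 0x0a)), path)

# Read the line and mark it as UTF-8 instead of the native encoding.
lines <- readLines(path, encoding = "UTF-8")

print(Encoding(lines))      # declared encoding of the string
print(charToRaw(lines[1]))  # bytes as held in memory; the last two should be c3 bc

unlink(path)
```

If the last bytes come back as fc instead of c3 bc on the Windows machine, something (for example the global options(encoding=...) setting, which affects how connections are opened) is converting the text to the native code page while reading.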
Update 2: library(rmarkdown); sessionInfo() gives

    R version 3.1.2 (2014-10-31)
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252 LC_MONETARY=German_Switzerland.1252
    [4] LC_NUMERIC=C LC_TIME=German_Switzerland.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] rmarkdown_0.4.2

    loaded via a namespace (and not attached):
    [1] digest_0.6.8 htmltools_0.2.6 tools_3.1.2

on Windows 7, whereas on Ubuntu it is:

    R version 3.1.2 (2014-10-31)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
    [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
    [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
    [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] rmarkdown_0.3.10

    loaded via a namespace (and not attached):
    [1] digest_0.6.8 htmltools_0.2.6 tools_3.1.2

I suspect the differing locales (Windows-1252 on Windows versus UTF-8 on Ubuntu) are the cause. How do I fix this?