Input Chinese characters not correctly echoed in E

Posted 2019-08-08 18:10

Question:

I have a weird encoding issue in my Emacs and R environment. With Sys.setlocale("LC_ALL","zh_CN.utf-8"); set in my .Rprofile, Chinese characters display correctly everywhere except when R echoes back characters that I typed in.

    > linkTexts[5]
          font 
    "使用帮助" 
    > functionNotExist()
    错误: 没有"functionNotExist"这个函数
    > fire <- "你好"
    > fire
    [1] "  "

As we can see, the Chinese characters in the vector linkTexts, the Chinese error messages, and the Chinese characters I type at the prompt are all displayed perfectly, yet the value echoed back for the typed characters shows only blank placeholders.

The sessionInfo() output is below. It is as expected given the Sys.setlocale("LC_ALL","zh_CN.utf-8"); setting:

    > sessionInfo()
    R version 2.15.2 (2012-10-26)
    Platform: i386-apple-darwin9.8.0/i386 (32-bit)

    locale:
    [1] zh_CN.utf-8/zh_CN.utf-8/zh_CN.utf-8/C/zh_CN.utf-8/C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     

    other attached packages:
    [1] XML_3.96-1.1

    loaded via a namespace (and not attached):
    [1] compiler_2.15.2 tools_2.15.2   

I have no locale settings in my .emacs file.

To me, this seems to be an Emacs encoding issue, but I don't know how to correct it. Any ideas or suggestions? Thanks.

Answer 1:

Your examples work for me out of the box. You can set the coding systems Emacs uses to decode and encode process I/O with M-x set-buffer-process-coding-system. Once you figure out which encoding works (if any), you can make the change permanent with:

    (add-hook 'ess-R-post-run-hook
              (lambda () (set-buffer-process-coding-system
                          'utf-8-unix 'utf-8-unix)))

Replace utf-8-unix with your chosen encoding.

I am not entirely convinced that the above will help. In your example, linkTexts displays well but fire does not, which doesn't look like an Emacs/ESS issue.



Answer 2:

VitoshKa's suggestion is exactly right. I just want to add my own findings here, since people may run into different but similar special-character problems that can be solved in the same way.

The root cause is the coding system used to encode input sent to the current buffer's process. As M-x describe-current-coding-system shows, the default process coding systems were fine for output (utf-8-unix) but wrong for input (iso-latin-1-unix):

    Coding systems for process I/O:
      encoding input to the process: 1 -- iso-latin-1-unix (alias: iso-8859-1-unix latin-1-unix)

      decoding output from the process: U -- utf-8-unix (alias: mule-utf-8-unix)
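
The same information can also be read programmatically. The following is only a sketch, assuming it is evaluated in the R process buffer (for example with M-:); `my/show-process-coding` is a hypothetical helper name, not part of ESS. The last two forms check whether the test string can be represented in each coding system at all.

```emacs-lisp
;; Hypothetical helper: report the coding systems of the process
;; attached to the current buffer (run it from the R process buffer).
(defun my/show-process-coding ()
  "Report the coding systems of the current buffer's process."
  (interactive)
  (let ((proc (get-buffer-process (current-buffer))))
    (if (null proc)
        (message "No process in this buffer")
      ;; `process-coding-system' returns (DECODING . ENCODING).
      (let ((coding (process-coding-system proc)))
        (message "decoding output: %s, encoding input: %s"
                 (car coding) (cdr coding))))))

;; A non-nil result is the index of the first character in the string
;; that the given coding system cannot encode; nil means every
;; character can be encoded.
(unencodable-char-position 0 2 'iso-latin-1-unix nil "你好")
(unencodable-char-position 0 2 'utf-8-unix nil "你好")
```

If the first check returns a position and the second returns nil, the typed characters cannot survive being encoded as iso-latin-1 on their way to R, which is consistent with the blank echo in the question.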

Changing the input coding system to utf-8-unix, either with M-x set-buffer-process-coding-system or by adding the ess-R-post-run-hook snippet to .emacs as VitoshKa suggested, is enough to solve the Chinese character display problem.
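
Since only the input side was wrong in my case, a narrower variant of the hook is possible. This is a minimal sketch, not tested against every ESS version: `my/ess-utf8-process-input` is a hypothetical name, and it assumes (as VitoshKa's snippet does) that the R process buffer is current when ess-R-post-run-hook runs.

```emacs-lisp
;; Hypothetical helper: change only the coding system used to encode
;; input sent to the process, keeping the current output decoding.
(defun my/ess-utf8-process-input ()
  "Send input to the R process as UTF-8, leaving output decoding unchanged."
  (let ((proc (get-buffer-process (current-buffer))))
    (when proc
      (set-process-coding-system
       proc
       (car (process-coding-system proc))   ; keep current decoding
       'utf-8-unix))))                      ; encoding for input

(add-hook 'ess-R-post-run-hook #'my/ess-utf8-process-input)
```

Replace utf-8-unix with another coding system if your locale needs a different one.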

The other problem people may run into because of this setting involves special characters in ESS. When you try to input special characters, you may get the error message 错误: 句法分析器%d行里不能有多字节字符, which in English is "invalid multibyte character in parser at line %d".

    > x <- data.frame(part = c("målløs", "ny"))
    错误: 句法分析器1行里不能有多字节字符

With the input coding system of the buffer process correctly set to utf-8-unix, the above error for special characters disappears.