How to make R's read_csv2() recognise the text

Posted 2020-02-14 06:25

Question:

I am trying to read a CSV file using read_csv2() from the readr package.

The problem is that read_csv2() doesn't recognise the characters properly, while base R's read.csv2() does.

For example:

the original value: KOZYATAĞI

how read_csv2() recognises: KOZYATA<'d0'>I

I have checked the help file and tried the code listed below, but I couldn't get it to work.

1st try: ended up with wrong characters

my_df <- read_csv2("my_path/my_file.csv")

2nd try: manually state the encoding.

my_df <- read_csv2("my_path/my_file.csv", locale(encoding = "UTF-8"))

Error: col_names must be TRUE, FALSE or a character vector

3rd try: additions to 2nd try because of the error message above.

my_df <- read_csv2("my_path/my_file.csv", locale(encoding = "UTF-8"), col_names = TRUE, col_types = NULL)

This one doesn't give an error, but it still doesn't recognise the characters properly.

How can I do this? Let me know if any other information is needed. Thanks in advance.

Answer 1:

@Amit, thanks for your suggestion.

In RStudio, I selected File > Save with Encoding... to see some of the available encoding options.

At the top of the encoding list, the pop-up window shows the system default (which is CP1254 on my computer). I then passed it to the encoding parameter as below, and it worked!

my_df <- read_csv2("my_path/my_file.csv", locale = locale(encoding = "CP1254"), col_names = TRUE, col_types = NULL)
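
For anyone who wants to reproduce this, here is a minimal sketch, assuming the readr package is installed and that the platform's iconv knows the name "CP1254". The temporary file and its contents are made up for illustration and stand in for my_file.csv. It also shows why the 2nd try in the question failed: the second positional argument of read_csv2() is col_names, so an unnamed locale() lands there; naming the argument avoids that.

```r
library(readr)

# Hypothetical sample file standing in for my_file.csv
tmp <- tempfile(fileext = ".csv")

# Semicolon-delimited text with a Turkish character and a decimal comma
txt <- "city;value\nKOZYATA\u011eI;1,5\n"

# Write the text as CP1254 bytes (the letter G-breve becomes byte 0xD0)
con <- file(tmp, open = "wb")
writeBin(iconv(txt, from = "UTF-8", to = "CP1254", toRaw = TRUE)[[1]], con)
close(con)

# Pass the locale by name so it is not mistaken for col_names
my_df <- read_csv2(tmp, locale = locale(encoding = "CP1254"))

print(my_df$city)
print(my_df$value)
```

Reading the same file with encoding = "UTF-8" would not raise an error, but the byte 0xD0 is not valid UTF-8 on its own, which is consistent with the KOZYATA<'d0'>I output in the question.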



Answer 2:

In RStudio version 1.2.1335 on Windows 10, the encoding options look similar.

With that, this code works:

read_csv("C:path/file.csv", locale = locale(encoding = "ISO-8859-1"), col_names = TRUE, col_types = NULL)

Spanish special characters (accents and ñ) then load correctly.
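
If the file's encoding is not known in advance, readr's guess_encoding() can suggest candidates before you pick one for locale(). The following is only a sketch: it assumes readr and stringi are installed, the sample file and its contents are invented for illustration, and the guesses are heuristic, so they should be checked against the data rather than trusted blindly.

```r
library(readr)

# Hypothetical sample file with Spanish text, written as ISO-8859-1 bytes
tmp <- tempfile(fileext = ".csv")
rows <- paste(rep("Jos\u00e9 Mu\u00f1oz,Espa\u00f1a,1\n", 20), collapse = "")
txt <- paste0("nombre,pais,valor\n", rows)
con <- file(tmp, open = "wb")
writeBin(iconv(txt, from = "UTF-8", to = "ISO-8859-1", toRaw = TRUE)[[1]], con)
close(con)

# Ask readr for likely encodings (a data frame of encoding and confidence)
guesses <- guess_encoding(tmp)
print(guesses)

# Read with an explicitly chosen encoding, passed by name
df <- read_csv(tmp, locale = locale(encoding = "ISO-8859-1"))
print(df$nombre[1])
```

Several single-byte encodings decode the same bytes without error, so the final choice still needs a visual check of the accented characters.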