I need to read a data file into R for my assignment. You can download it from the following site.
http://archive.ics.uci.edu/ml/datasets/Acute+Inflammations
The data file ends with the extension .data, which I have never seen before. I tried read.table and the like but could not read it into R properly. Can anyone help me with this, please?
You have a UTF-16LE file, a.k.a. "Unicode" on Windows (in case you're on that OS). Try passing that encoding to read.table through its fileEncoding argument.
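A minimal sketch of the idea, not the exact call from this answer. Since the real download is not bundled here, it first writes a small stand-in file with the same traits (UTF-16 little endian, byte order mark, comma decimals); the two rows are made up for illustration. With the real file, only the read.table() call at the end should be needed, pointed at wherever you saved the download.

```r
# Build a tiny stand-in for the real file: a BOM followed by UTF-16LE bytes.
# The two rows are made-up illustration data, not rows from the dataset.
txt  <- "35,5\tno\tyes\n41,2\tyes\tno\n"
bom  <- as.raw(c(0xff, 0xfe))   # byte order mark for little-endian UTF-16
body <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
path <- tempfile(fileext = ".data")
writeBin(c(bom, body), path)

# Without an encoding, R sees the NUL bytes that UTF-16 places between
# ASCII characters and cannot parse the fields properly.
# Declaring "UTF-16" lets the BOM decide the byte order while decoding.
dat <- read.table(path, fileEncoding = "UTF-16", dec = ",")
print(dat)
```

The dec = "," argument is there because the file writes decimals with a comma; without it the first column would come back as text rather than numbers.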
That said, trying what @Gavin Simpson suggested might also help, since you can add your own headings and save the file.
It's a UTF-16 little-endian file with a byte order mark at the beginning. read.table will fail unless you specify the correct encoding. This works for me on macOS. Decimals are indicated by a comma.
From your link:
Thus you need to use read.table() with sep = "\t". It also looks like the file uses a comma for the decimal, so also specify dec = "," inside read.table(). It looks like you'll need to put in the column headings manually, though your link defines them.
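Putting those arguments together might look like the sketch below. It reads from an in-memory string rather than the real file, so it only illustrates sep, dec and col.names; the rows and the three column names are hypothetical placeholders, and the real names should come from the dataset's documentation page.

```r
# Two made-up rows in the same shape as the file:
# tab-separated fields, comma as the decimal mark.
raw_text <- "35,5\tno\tyes\n41,2\tyes\tno\n"

# Hypothetical column names for illustration only; the real file has its
# own set of attributes, listed on the dataset page.
cols <- c("temperature", "nausea", "lumbar_pain")

dat <- read.table(text = raw_text,
                  sep = "\t",          # attributes are tab-separated
                  dec = ",",           # "35,5" is parsed as the number 35.5
                  col.names = cols,    # supply headings manually
                  stringsAsFactors = FALSE)
str(dat)
```

For the actual download, the same arguments would be combined with the file path and the encoding setting discussed in the other answers.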
Make sure you see @Gavin Simpson's comment below to clean up other undocumented "features" of this dataset.