
R read.table(), how can I read the header but also

Posted 2019-02-02 06:09

Question:

Data.txt:

Index;Time;
1;2345;
2;1423;
3;5123;

The code:

dat <- read.table('data.txt', skip = 1, nrows = 2, header =TRUE, sep =';')

The result:

  X1 X2345
1  2  1423
2  3  5123

I expect the header to be Index and Time, as follows:

  Index Time
1   2   1423
2   3   5123

How do I do that?

Answer 1:

I am afraid there is no direct way to achieve this. Either read the entire table and remove the unwanted rows afterwards, or read the file twice and assign the header later:

header <- read.table('data.txt', nrows = 1, header = FALSE, sep =';', stringsAsFactors = FALSE)
dat    <- read.table('data.txt', skip = 2, header = FALSE, sep =';')
colnames( dat ) <- unlist(header)
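
The two-pass approach can be wrapped in a small helper. The following is only a sketch under assumptions: read_skip_after_header is a hypothetical name (not a base R function), and a temporary file stands in for data.txt from the question. It also drops the unnamed column that the trailing ';' produces.

```r
# Recreate the sample file from the question in a temporary location.
path <- tempfile(fileext = ".txt")
writeLines(c("Index;Time;", "1;2345;", "2;1423;", "3;5123;"), path)

# Hypothetical helper: keep the header, skip `skip` data rows, read `nrows` rows.
read_skip_after_header <- function(file, skip = 1, nrows = -1, sep = ";") {
  # First pass: read only the header line as plain strings.
  header <- read.table(file, nrows = 1, header = FALSE, sep = sep,
                       stringsAsFactors = FALSE)
  h <- unlist(header)
  # Second pass: skip the header line plus `skip` data rows.
  dat <- read.table(file, skip = 1 + skip, nrows = nrows, header = FALSE,
                    sep = sep)
  # Keep only columns that have a real name (drops the trailing empty field).
  keep <- !is.na(h) & nzchar(h)
  dat <- dat[, keep, drop = FALSE]
  colnames(dat) <- unname(h[keep])
  dat
}

dat <- read_skip_after_header(path, skip = 1, nrows = 2)
print(dat)
```

With the sample data this should leave two rows (Index 2 and 3) under the original column names.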


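Answer 2:

You're using skip incorrectly: skip is applied before the header is read, so skip = 1 throws away the header line itself. Read the header plus the first three rows, then drop the first row:

dat <- read.table('data.txt', nrows = 3, header = TRUE, sep = ';')[-1, ]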
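
To see why skip produces the X1/X2345 header from the question, the behaviour can be reproduced without a file. This is a minimal sketch that assumes the sample data is passed through the text argument of read.table; the variable names are illustrative only.

```r
# The sample data from the question, as a single string.
txt <- "Index;Time;\n1;2345;\n2;1423;\n3;5123;"

# skip = 1 drops "Index;Time;" first, so "1;2345;" is parsed as the header.
skipped <- read.table(text = txt, skip = 1, header = TRUE, sep = ";")
print(names(skipped))

# Reading everything keeps the real header; the unwanted row is dropped after.
whole <- read.table(text = txt, header = TRUE, sep = ";")
dat <- whole[-1, c("Index", "Time")]
print(dat)
```

Selecting the Index and Time columns by name also discards the extra column created by the trailing ';'.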


Answer 3:

A solution using fread from the data.table package. The trailing ; creates an unnamed third column (V3), which is dropped here:

library(data.table)
fread("Data.txt", drop = "V3")[-1]

Result:

> fread("Data.txt", drop = "V3")[-1]
   Index Time
1:     2 1423
2:     3 5123


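Answer 4:

Instead of read.table(), use a readr function, piped to dplyr::slice(). Because the file is semicolon-delimited, use read_delim() with delim = ";" rather than read_csv(), which expects commas.

library(readr)
library(dplyr)
dat <- read_delim("data.txt", delim = ";") %>% slice(-1)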

It's very fast too.



Answer 5:

You could (in most cases) strip the trailing ; from each line, write a new file without the second line (which is the first data row, because line 1 is the header), and use read.csv instead of read.table:

> txt <- "Index;Time;
  1;2345;
  2;1423;
  3;5123;" 
> writeLines(sub(";$", "", readLines(textConnection(txt))[-2]), 'newTxt.txt')
> read.csv('newTxt.txt', sep = ";")
##   Index Time
## 1     2 1423
## 2     3 5123
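
The same idea also works without writing an intermediate file, since read.csv accepts the cleaned lines through its text argument. A minimal sketch, assuming a temporary file in place of the original data file:

```r
# Recreate the sample file from the question in a temporary location.
path <- tempfile(fileext = ".txt")
writeLines(c("Index;Time;", "1;2345;", "2;1423;", "3;5123;"), path)

lines <- readLines(path)
# Drop line 2 (the first data row) and strip the trailing ';' from each line.
cleaned <- sub(";$", "", lines[-2])
# Parse the cleaned lines directly instead of writing them to a new file.
dat <- read.csv(text = cleaned, sep = ";")
print(dat)
```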