Question:

I am using the skip option of read.csv to skip some lines before reading a CSV file into a data frame. However, when I then call names(dataframe), my column names are gone and I get seemingly random strings as column names instead. Why does this happen?

> mydf = read.csv("mycsvfile.csv",skip=100)
> names(mydf)
[1] "X2297256" "X3"

Without the skip option, it works fine:

> mydf = read.csv("mycsvfile.csv")
> names(mydf)
[1] "col1" "col2"

Answer 1:

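If you skip lines in a file, you skip complete lines, so if your header is on the first line and you skip 100 lines, the header line is skipped too. read.csv then treats the first line it actually reads as the header, which is why data values show up as column names (R prefixes names that start with a digit with "X"). If you want to skip part of the file and still keep the headers, you'll need to read them separately: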

headers <- names(read.csv("mycsvfile.csv", nrows=1))
mydf <- read.csv("mycsvfile.csv", header=FALSE, col.names=headers, skip=100)
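One detail worth checking in the two-step approach above: skip counts physical lines, so skip=100 drops the header line plus the first 99 data rows. Here is a minimal, self-contained sketch of the same idea on a throwaway file. The temp file, its contents, and the n_skip variable are made up for illustration.

```r
# Build a small throwaway CSV (hypothetical data) so the example runs anywhere.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(col1 = 1:10, col2 = 11:20), tmp, row.names = FALSE)

n_skip <- 4  # number of data rows to drop (hypothetical value)

# Read the header line plus a single data row, just to recover the column names.
headers <- names(read.csv(tmp, nrows = 1))

# skip counts lines from the top of the file, so add 1 to step over the header too.
mydf <- read.csv(tmp, header = FALSE, col.names = headers, skip = n_skip + 1)

names(mydf)  # "col1" "col2"
nrow(mydf)   # 6
```

Because the skipped lines are never parsed into the data frame, this should stay cheap even when most of a large file is skipped.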

Answer 2:

It is not necessary to read in the headers separately. You can do this in one line by using negative indexing on the data frame, where a negative index means "keep all rows except those in the negative index (range)".

So if you want to keep the headers and then drop the first N data rows, you just need to do this:

mydf <- read.csv("mycsvfile.csv", header=TRUE)[-(1:N), ]
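A small sketch of the indexing approach on a throwaway file, in case it helps to see it run. The temp file, its contents, and the value of N are made up for illustration. Note that this variant reads the whole file into memory before dropping rows, and that a plain negative index misbehaves when N is 0 (an empty index selects no rows at all), so the sketch uses a logical index instead.

```r
# Build a small throwaway CSV (hypothetical data) so the example runs anywhere.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(col1 = 1:10, col2 = 11:20), tmp, row.names = FALSE)

N <- 4  # number of leading data rows to drop (hypothetical value)

# Read everything, header included, so the column names are kept.
full <- read.csv(tmp, header = TRUE)

# Keep only rows whose position is greater than N.
# The logical index also works for N = 0; drop = FALSE keeps a data frame
# even if the file has a single column.
mydf <- full[seq_len(nrow(full)) > N, , drop = FALSE]

# Optional: renumber the rows from 1 instead of keeping the original positions.
rownames(mydf) <- NULL

names(mydf)  # "col1" "col2"
nrow(mydf)   # 6
```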
