How to add header to a dataset in R?

2019-04-03 12:03发布

I need to read the ''wdbc.data' in the following data folder: http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/

Doing this in R is easy using command read.csv but as the header is missing how can I add it? I have the information but don't know how to do this and I'd prefer do not edit the data file.

3条回答
欢心
2楼-- · 2019-04-03 12:45

in case you are interested in reading some data from a .txt file and only extract few columns of that file into a new .txt file with a customized header, the following code might be useful:

# input some data from 2 different .txt files:
civit_gps <- read.csv(file="/path2/gpsFile.csv",head=TRUE,sep=",")
civit_cam <- read.csv(file="/path2/cameraFile.txt",head=TRUE,sep=",")

# assign the name for the output file:
seqName <- "seq1_data.txt"

#=========================================================
# Extract data from imported files
#=========================================================
# From Camera:
frame_idx <- civit_cam$X.frame
qx        <- civit_cam$q.x.rad.
qy        <- civit_cam$q.y.rad.
qz        <- civit_cam$q.z.rad.
qw        <- civit_cam$q.w

# From GPS:
gpsT      <- civit_gps$X.gpsTime.sec.
latitude  <- civit_gps$Latitude.deg.
longitude <- civit_gps$Longitude.deg.
altitude  <- civit_gps$H.Ell.m.
heading   <- civit_gps$Heading.deg.
pitch     <- civit_gps$pitch.deg.
roll      <- civit_gps$roll.deg.
gpsTime_corr <- civit_gps[frame_idx,1]

#=========================================================
# Export new data into the output txt file
#=========================================================
myData <- data.frame(c(gpsTime_corr),
                     c(frame_idx),
                     c(qx),
                     c(qy),
                     c(qz),
                     c(qw))
# Write :
cat("#GPSTime,frameIdx,qx,qy,qz,qw\n", file=seqName)
write.table(myData, file = seqName,row.names=FALSE,col.names=FALSE,append=TRUE,sep = ",")

Of course, you should modify this sample script based on your own application.

查看更多
神经病院院长
3楼-- · 2019-04-03 12:54

You can do the following:

Load the data:

test <- read.csv(
          "http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data",
          header=FALSE)

Note that the default value of the header argument for read.csv is TRUE so in order to get all lines you need to set it to FALSE.

Add names to the different columns in the data.frame

names(test) <- c("A","B","C","D","E","F","G","H","I","J","K")

or alternative and faster as I understand (not reloading the entire dataset):

colnames(test) <- c("A","B","C","D","E","F","G","H","I","J","K")
查看更多
4楼-- · 2019-04-03 13:08

You can also use colnames instead of names if you have data.frame or matrix

查看更多
登录 后发表回答