Join two dataframes before exporting as .csv files

2020-07-16 12:29发布

问题:

I am working on a large questionnaire - and I produce summary frequency tables for different questions (e.g. df1 and df2).

a<-c(1:5)
b<-c(4,3,2,1,1)
Percent<-c(40,30,20,10,10) 
df1<-data.frame(a,b,Percent)
c<-c(1,1,5,2,1)
Percent<-c(10,10,50,20,10)
df2<-data.frame(a,c,Percent)
rm(a,b,c,Percent)

I normally export the dataframes as csv files using the following command:

write.csv(df1 ,file="df2.csv")

However, as my questionnaire has many questions and therefore dataframes, I was wondering if there is a way in R to combine different dataframes (say with a line separating them), and export these to a csv (and then ultimately open them in Excel)? When I open Excel, I therefore will have just one file with all my question dataframes in, one below the other. This one csv file would be so much easier than having individual files which I have to open in turn to view the results.

Many thanks in advance.

回答1:

If your end goal is an Excel spreadsheet, I'd look into some of the tools available in R for directly writing an xls file. Personally, I use the XLConnect package, but there is also xlsx and also several write.xls functions floating around in various packages.

I happen to like XLConnect because it allows for some handy vectorization in situations just like this:

require(XLConnect)

#Put your data frames in a single list
# I added two more copies for illustration
dfs <- list(df1,df2,df1,df2)

#Create the xls file and a sheet
# Note that XLConnect doesn't seem to do tilde expansion!
wb <- loadWorkbook("/Users/jorane/Desktop/so.xls",create = TRUE)
createSheet(wb,"Survey")

#Starting row for each data frame
# Note the +1 to get a gap between each
n <- length(dfs)
rows <- cumsum(c(1,sapply(dfs[1:(n-1)],nrow) + 1))

#Write the file
writeWorksheet(wb,dfs,"Survey",startRow = rows,startCol = 1,header = FALSE)
#If you don't call saveWorkbook, nothing will happen
saveWorkbook(wb)

I specified header = FALSE since otherwise it will write the column header for each data frame. But adding a single row at the top in the xls file at the end isn't much additional work.



回答2:

As James commented, you could use

merge(df1, df2, by="a")

but that would combine the data horizontally. If you want to combine them vertically you could use rbind:

rbind(df1, df2, df3,...)

(Note: the column names need to match for rbind to work).