I'm attempting to merge multiple csv files using R. all of the CSV files have the same fields and are all a shared folder only containing these CSV files. I've attempted to do it using the following code:
multmerge=function(mypath) {
filenames=list.files(path=mypath, full.names=TRUE)
datalist= lapply(filenames, function (x) read.csv(file=x, header=True))
Reduce(function(x,y) merge(x,y), datalist)}
I am entering my path as something like "Y:/R Practice/specdata".
I do get an ouput when I apply the function to my 300 or so csv files, but the result gives me my columns names, but beneath it has <0 rows> (or 0-length row.names).
Please let me know if you have any suggestions on why this isn't working and how I can fix it.
For a shorter, faster solution
library(dplyr)
library(readr)
df <- list.files(full.names = TRUE) %>%
lapply(read_csv) %>%
bind_rows
Your code worked for me, but you need change header = True
to header = TRUE
.
If all your csv files have exactly the same fields (column names) and you want simply to combine them vertically, you should use rbind
instead of merge
:
> a
A B
[1,] 2.471202 38.949232
[2,] 16.935362 6.343694
> b
A B
[1,] 0.704630 0.1132538
[2,] 4.477572 11.8869057
> rbind(a, b)
A B
[1,] 2.471202 38.9492316
[2,] 16.935362 6.3436939
[3,] 0.704630 0.1132538
[4,] 4.477572 11.8869057
Another option that has proved to work for my setup:
multmerge = function(path){
filenames=list.files(path=path, full.names=TRUE)
rbindlist(lapply(filenames, fread))
}
path <- "Dropbox/rstudio-share/dataset/MB"
DF <- multmerge(path)
If you need a much granular control of your CSV file during the loading process you can change the fread
by a function like so:
multmerge = function(path){
filenames=list.files(path=path, full.names=TRUE)
rbindlist(lapply(filenames, function(x){read.csv(x, stringsAsFactors = F, sep=';')}))
}