This is a follow-up to my last question ("Rbinding large list of dataframes after I did some data cleaning on the list"). I've gotten smarter since then, and the former question got messy.
I have 43 xlsx files which I loaded into a list in R:
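library(readxl)
# List every .xlsx file below the working directory (pattern is a regex, not a glob)
file.list <- list.files(recursive = TRUE, pattern = "\\.xlsx$")
dat <- lapply(file.list, function(i) {
  x <- read_xlsx(i, sheet = 1, col_names = TRUE)
  # Create column with file name
  x$file <- i
  # Return data
  x
})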
I then added some column names:
my_names <- c("ID", "UDLIGNNR","BILAGNR", "AKT", "BA",
"IART", "HTRANS", "DTRANS", "BELOB", "REGD",
"BOGFD", "VALORD", "UDLIGND",
"UÅ", "AFSTEMNGL", "NRBASIS","FIBILAG", "FILE")
dat <- lapply(dat, setNames, my_names)
Then I removed some columns:
dat <- lapply(dat, function(x) { x["UÅ"] <- NULL; x })
dat <- lapply(dat, function(x) { x["FIBILAG"] <- NULL; x })
I don't really need to remove them, but when I tried to merge the dataframes I kept getting errors about the class of these two columns, so I just removed them.
Then I changed all my columns to character. I'm kinda new to R, so I'm aware that this code isn't very sexy and that I probably could have written a loop or just one function for this. But this is what I did:
dat <- lapply(dat, function(x) { x["ID"] <- as.character(x["ID"]); x })
[I did the same for all columns]
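To show what this call seems to produce, here is a minimal sketch on a made-up data frame (`toy` and its values are hypothetical, not my real data). With single brackets, `x["ID"]` is still a one-column data frame, so `as.character()` collapses the whole column into a single string instead of converting it element by element:

```r
# Hypothetical toy data, not the real spreadsheet contents
toy <- data.frame(ID = c(1, 2, 3), BELOB = c(10.5, 20, 30))

# Single brackets return a one-column data frame (a list), not a vector
sub <- toy["ID"]
is.data.frame(sub)

# as.character() on that list deparses the column into ONE string
bad <- as.character(sub)
length(bad)   # one element, although the column has three rows
print(bad)
```

Assigning that single string back into the column recycles it over every row, which would explain why the values look wrong after the conversion.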
And then I go and bind the data.
library(data.table)
df <- rbindlist(dat)
EDIT:
I've found that my binding method isn't the problem (thank you for your input on that), so I've deleted the part about the binding methods.
The problem lies in how I change the column types of the dataframes within the list.
I've also tried:
dat <- lapply(dat, function(x) { x[,"ID"] <- as.character(x[,"ID"]); x })
I added a comma before "ID". This didn't help. I feel like I need to use unlist, but I'm not sure how to use it here?
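For reference, here is a minimal sketch of one approach that might work, assuming every column is a plain vector; `dat_toy` and `to_character` are hypothetical names used only for illustration. Since `read_xlsx` returns tibbles, `x[, "ID"]` still gives a one-column tibble rather than a vector, which would explain why the comma changed nothing. Double brackets (`x[["ID"]]`) extract the vector itself, and `x[] <- lapply(x, as.character)` converts every column in one go, so `unlist` should not be needed:

```r
# Hypothetical stand-ins for two of the loaded sheets
dat_toy <- list(
  data.frame(ID = c(1, 2), BELOB = c(10.5, 20)),
  data.frame(ID = c("3", "4"), BELOB = c(30, 40))
)

# x[] <- ... keeps the data frame structure while replacing every column
to_character <- function(x) {
  x[] <- lapply(x, as.character)
  x
}
dat_toy <- lapply(dat_toy, to_character)

# All columns are now character, so the classes agree when binding
combined <- do.call(rbind, dat_toy)
str(combined)
```

To convert only one column, `x[["ID"]] <- as.character(x[["ID"]])` inside the `lapply` should do the same for that column alone. The sketch binds with base `rbind` to stay self-contained; `rbindlist(dat)` should accept the converted list as well.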