Setting the scene:
I have a directory containing 50 .csv files.
All files have unique names, e.g. 1.csv, 2.csv, ...
Each file may have a different number of rows, but always has the same 4 columns.
The column headers are:
- Date
- Result 1
- Result 2
- ID
I want to merge them all into one data frame (mydf) and then drop any rows that contain an NA value,
so that I can count how many complete rows there are for each "ID", for example by calling:
- myfunc("my_files", 1)
- myfunc("my_files", c(2,4,6))
My code so far:
myfunc <- function(directory, id = 1:50) {
    files_list <- list.files(directory, full.names=T)
    mydf <- data.frame()
    for (i in 1:50) {
        mydf <- rbind(mydf, read.csv(files_list[i]))
    }
    mydf_subset <- mydf[which(mydf[, "ID"] %in% id),]
    mydf_subna <- na.omit(mydf_subset)
    table(mydf_subna$ID)
}
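As an aside, the same steps can be sketched without the hard-coded 1:50 loop, so the function works for however many files the directory holds. This is only a rough sketch: the `pattern` filter, the temporary `demo_dir`, and the two tiny demo files are assumptions made up for illustration, not the real data.

```r
# Same logic as above, but reading every .csv that is actually present
myfunc <- function(directory, id = 1:50) {
  files_list <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  # Read all files into a list, then bind once instead of growing mydf in a loop
  mydf <- do.call(rbind, lapply(files_list, read.csv))
  mydf_subset <- mydf[mydf$ID %in% id, ]
  mydf_subna <- na.omit(mydf_subset)
  table(mydf_subna$ID)
}

# Hypothetical demo data: two small files written to a temporary directory
demo_dir <- file.path(tempdir(), "my_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Date = "2020-01-01", Result.1 = c(1, NA), Result.2 = c(3, 4), ID = 1),
          file.path(demo_dir, "1.csv"), row.names = FALSE)
write.csv(data.frame(Date = "2020-01-02", Result.1 = c(5, 6), Result.2 = c(7, 8), ID = 2),
          file.path(demo_dir, "2.csv"), row.names = FALSE)

res <- myfunc(demo_dir, c(1, 2))
print(res)
```

In the demo, one of the two rows for ID 1 contains an NA, so only the other row is counted for that ID.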
My issues and where I need help:
My results come out like this:
2 4 6
200 400 600
and I'd like to transpose them to look like this. I'm not sure whether table() is the right thing to return, or whether I should be using as.matrix() instead?
2 200
4 400
6 600
I'd also like to either keep the headers from the original files or assign new ones, for example:
ID Count
2  200
4  400
6  600
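One possible way to get the ID/Count layout above, sketched on made-up data rather than the real files: as.data.frame() turns a one-way table into a two-column data frame, which can then be renamed. The toy `mydf_subna` and the name `counts` are assumptions used only for illustration.

```r
# Toy stand-in for the merged, NA-free data (values are made up)
mydf_subna <- data.frame(ID = c(2, 2, 4, 4, 4, 6))

# as.data.frame() on a one-way table gives one row per ID plus a Freq column
counts <- as.data.frame(table(mydf_subna$ID), stringsAsFactors = FALSE)
names(counts) <- c("ID", "Count")

# table() keeps the IDs as character labels, so convert them back to numbers
counts$ID <- as.integer(counts$ID)
print(counts)
```

Returning `counts` as the last expression of myfunc, in place of the bare table() call, would give one row per ID with the two headers shown above.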
Any and all advice is welcome
Matt
Additional update
I tried amending the code to incorporate some of the helpful comments below, so I also have a version that looks like this:
myfunc <- function(directory, id = 1:50) {
    files_list <- list.files(directory, full.names=T)
    mydf <- data.frame()
    for (i in 1:50) {
        mydf <- rbind(mydf, read.csv(files_list[i]))
    }
    mydf_subset <- mydf[which(mydf[, "ID"] %in% id),]
    mydf_subna <- na.omit(mydf_subset)
    result <- data.frame(mydf_subna$ID)
    transposed_result <- t(result)
    colnames(transposed_result) <- c("ID","Count")
}
which I try to call with this:
myfunc("myfiles", 1)
myfunc("myfiles", c(2, 4, 6))
but I get this error:
> myfunc("myfiles", c(2, 4, 6))
Error in `colnames<-`(`*tmp*`, value = c("ID", "Count")) :
length of 'dimnames' [2] not equal to array extent
I wonder whether I'm creating this data frame incorrectly: should I be using cbind(), or is the problem that I'm not summing the rows by ID?
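For what it's worth, the error seems reproducible on a tiny example, which suggests the shape of the transposed object is the cause: data.frame(mydf_subna$ID) has one column and one row per record, so t() yields a matrix with a single row and as many columns as there are records, and two column names cannot be assigned to it. The vector `ids` below is made up to stand in for mydf_subna$ID.

```r
ids <- c(2, 2, 4, 6)             # made-up IDs standing in for mydf_subna$ID
result <- data.frame(ids)        # 4 rows, 1 column: no counting has happened yet
transposed_result <- t(result)   # 1 row, 4 columns
print(dim(transposed_result))

# Assigning two names to a four-column matrix fails; capture the message
err <- tryCatch({
  colnames(transposed_result) <- c("ID", "Count")
  NULL
}, error = function(e) conditionMessage(e))
print(err)
```

So the missing step looks like the counting by ID itself (for instance with table()), rather than cbind() or the transpose.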