R: Transpose a results table and add column headers

Posted 2019-09-09 19:29

Question:

Setting the scene:

So I have a directory with 50 .csv files in it.

All files have unique names e.g. 1.csv 2.csv ...

The contents of each may vary in the number of rows but always have 4 columns

The column headers are:

  • Date
  • Result 1
  • Result 2
  • ID

I want them all to be merged together into one dataframe (mydf) and then I'd like to ignore any rows where there is an NA value.

The goal is to count how many complete instances of each "ID" there are, for example by calling:

  • myfunc("my_files", 1)
  • myfunc("my_files", c(2,4,6))

My code so far:

myfunc <- function(directory, id = 1:50) {
        files_list <- list.files(directory, full.names=T)
        mydf <- data.frame()
        for (i in 1:50) {
                mydf <- rbind(mydf, read.csv(files_list[i]))
        }
        mydf_subset <- mydf[which(mydf[, "ID"] %in% id),]
        mydf_subna <- na.omit(mydf_subset)
        table(mydf_subna$ID)
}

My issues and where I need help:

My results come out this way

2   4    6   
200 400  600

and I'd like to transpose them to look like the layout below. I'm not sure whether calling table() is right, or whether I should perhaps use as.matrix() instead.

2 200
4 400
6 600

I'd also like to have either the headers from the original files or assign new ones

ID Count
2  200
4  400
6  600

Any and all advice is welcome

Matt

Additional update

I tried amending my code to incorporate some of the helpful comments below, so I also have a version that looks like this:

myfunc <- function(directory, id = 1:50) {
        files_list <- list.files(directory, full.names=T)
        mydf <- data.frame()
        for (i in 1:50) {
                mydf <- rbind(mydf, read.csv(files_list[i]))
        }
        mydf_subset <- mydf[which(mydf[, "ID"] %in% id),]
        mydf_subna <- na.omit(mydf_subset)
        result <- data.frame(mydf_subna$ID)
        transposed_result <- t(result)
        colnames(transposed_result) <- c("ID","Count")
}

which I try to call with this:

myfunc("myfiles", 1)
myfunc("myfiles", c(2, 4, 6))

but I get this error

> myfunc("myfiles", c(2, 4, 6))
Error in `colnames<-`(`*tmp*`, value = c("ID", "Count")) : 
  length of 'dimnames' [2] not equal to array extent

I wonder if perhaps I'm not creating this data.frame correctly and should be using a cbind or not summing the rows by ID maybe?

Answer 1:

You need to change your function to create a data frame rather than a table and then transpose that data frame. Change the line

table(mydf_subna$ID)

to be instead

result <- data.frame(mydf_subna$ID) 

then use the t() function, which transposes your data frame (the result is a matrix)

transposed_result <- t(result) 

colnames(transposed_result) <- c("ID","Count") 
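
As the update in the question shows, these lines raise the 'dimnames' error when used as written. A minimal sketch of why, using a made-up vector `ids` as a stand-in for `mydf_subna$ID` (the data and variable names here are assumptions for illustration): `data.frame()` keeps one row per observation and does no counting, so `t()` produces a matrix with one row and as many columns as there are observations, and two column names cannot be assigned to it. Counting with `table()` first and then converting with `as.data.frame()` is one way to get one row per ID.

```r
# Made-up IDs standing in for mydf_subna$ID (hypothetical data)
ids <- c(2, 2, 4, 4, 4, 6)

result <- data.frame(ids)   # 6 rows, 1 column: nothing has been counted yet
transposed <- t(result)     # t() on a data frame returns a 1 x 6 matrix
print(dim(transposed))

# Assigning two column names to six columns fails; capture the message
err <- tryCatch(
  {
    colnames(transposed) <- c("ID", "Count")
    NULL
  },
  error = function(e) conditionMessage(e)
)
print(err)

# Counting first gives one row per ID, already in the vertical layout
counts <- as.data.frame(table(ids))
names(counts) <- c("ID", "Count")
print(counts)
```

Note that with this conversion the ID column is a factor, not a number.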


Answer 2:

Welcome to Stack Overflow.

I am assuming that the function you have written returns the table, which is saved here in the variable ans.

You could try this code:

ans <- myfunc("my_files", c(2,4,6))

ans2 <- data.frame(ans)

colnames(ans2) <- c('ID' ,'Count')
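
The same conversion could also be done inside the function itself. The following is only a sketch under some assumptions: the function name `count_complete`, the temporary directory and the demo CSV contents are all made up for illustration, and the column names are written as `Result.1` and `Result.2` because `read.csv()` replaces spaces in headers by default. It reads every .csv file in the directory rather than a hard-coded 50, drops incomplete rows, and returns a data frame with `ID` and `Count` columns.

```r
# Sketch: count complete rows per ID across all CSV files in a directory.
# count_complete is a hypothetical name, not part of the original code.
count_complete <- function(directory, id = NULL) {
  files_list <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  # Read every file and stack the rows in one step
  mydf <- do.call(rbind, lapply(files_list, read.csv))
  # Keep only the requested IDs (all IDs when id is NULL)
  if (!is.null(id)) mydf <- mydf[mydf$ID %in% id, ]
  complete <- na.omit(mydf)
  # table() counts rows per ID; as.data.frame() turns it into two columns
  counts <- as.data.frame(table(complete$ID), stringsAsFactors = FALSE)
  names(counts) <- c("ID", "Count")
  counts$ID <- as.integer(counts$ID)
  counts
}

# Made-up demo data written to a temporary directory
demo_dir <- file.path(tempdir(), "my_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(
  data.frame(Date = "2019-01-01", Result.1 = c(1, NA, 3),
             Result.2 = c(4, 5, 6), ID = 2),
  file.path(demo_dir, "1.csv"), row.names = FALSE
)
write.csv(
  data.frame(Date = "2019-01-02", Result.1 = c(7, 8),
             Result.2 = c(9, 10), ID = 4),
  file.path(demo_dir, "2.csv"), row.names = FALSE
)

res <- count_complete(demo_dir, c(2, 4))
print(res)
```

Passing `stringsAsFactors = FALSE` keeps the IDs as text so that they can be converted back to integers; without it the `ID` column would be a factor.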


Tags: r rbind r-table