Row and column sums in R

Posted 2019-07-03 18:57

This is an example of what my data set (MergedData) looks like in R, where each of my participants (5 rows) obtained a score in every test (7 columns). I would like to know the total score across all tests (all columns) for each participant (row).

Also, my complete data set has more than just these few variables, so if possible I would like to do it using a formula and a loop rather than typing it out row by row or column by column.

Participant TestScores     
ParticipantA    2   4   2   3   2   3   4
ParticipantB    1   3   2   2   3   3   3
ParticipantC    1   4   4   2   3   4   2
ParticipantD    2   4   2   3   2   4   4
ParticipantE    1   3   2   2   2   2   2

I have tried this but it doesn't work:

Test_Scores <- rowSums(MergedData[Test1, Test2, Test3], na.rm=TRUE)

I get the following error-message:

Error in `[.data.frame`(MergedData, Test1, Test2, Test3,  : 
  unused arguments

How do I solve this? Thank you!!

Tags: r rowsum
6 answers
祖国的老花朵
#2 · 2019-07-03 19:21

For small data, it might be interesting to convert the data.frame to a table then use addmargins().

With this sample data

MergedData<-data.frame(Participant=letters[1:5],
    Test1 = c(2,1,1,2,1),
    Test2 = c(4,3,4,4,3),
    Test3 = c(2,2,4,2,2),
    Test4 = c(3,2,2,3,2),
    Test5 = c(2,3,3,2,2)
)

and this helper function

as.table.data.frame<-function(x, rownames=0) {
    numerics <- sapply(x,is.numeric)
    chars <- which(sapply(x,function(x) is.character(x) || is.factor(x)))
    names <- if(!is.null(rownames)) {
        if (length(rownames)==1) {
            if (rownames ==0) {
                 rownames(x)
            } else {
                as.character(x[,rownames])
            }
        } else {
            rownames
        }
    } else {
          if(length(chars)==1) {
            as.character(x[,chars])
        } else {
            rownames(x)
        }
    }
    x<-as.matrix(x[,numerics])
    rownames(x)<-names
    structure(x, class="table")
}

you could do

addmargins(as.table(MergedData, rownames=1))

to get

    Test1 Test2 Test3 Test4 Test5 Sum
a       2     4     2     3     2  13
b       1     3     2     2     3  11
c       1     4     4     2     3  14
d       2     4     2     3     2  13
e       1     3     2     2     2  10
Sum     7    18    12    12    12  61

Probably not super useful in this case, but a fun use of addmargins nonetheless.
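
For comparison, here is a rough sketch that skips the helper and builds the table from a plain numeric matrix. It assumes the first column holds the row labels, as in the sample data above; the names m and totals are only illustrative.

```r
MergedData <- data.frame(Participant = letters[1:5],
    Test1 = c(2,1,1,2,1),
    Test2 = c(4,3,4,4,3),
    Test3 = c(2,2,4,2,2),
    Test4 = c(3,2,2,3,2),
    Test5 = c(2,3,3,2,2)
)

# Numeric part as a matrix, labelled by participant
m <- as.matrix(MergedData[-1])
rownames(m) <- as.character(MergedData$Participant)

# as.table() accepts a numeric matrix; addmargins() then appends
# a "Sum" row and a "Sum" column
totals <- addmargins(as.table(m))
totals
```

The row totals are then available as totals[, "Sum"] and the column totals as totals["Sum", ].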

Ridiculous、
#3 · 2019-07-03 19:24

I think you want this:

rowSums(MergedData[,c('Test1', 'Test2', 'Test3')], na.rm=TRUE)
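
If there are many test columns, the names do not have to be typed out one by one. A minimal sketch, assuming the score columns all start with "Test" (the sample data, including the Age column, is made up to mirror the question):

```r
# Hypothetical sample data mirroring the question's layout
MergedData <- data.frame(
  Participant = c("ParticipantA", "ParticipantB", "ParticipantC"),
  Test1 = c(2, 1, 1),
  Test2 = c(4, 3, 4),
  Test3 = c(2, 2, 4),
  Age   = c(30, 25, 41)  # invented, unrelated numeric column that must not be summed
)

# Pick every column whose name starts with "Test" (assumed naming scheme)
test_cols <- grep("^Test", names(MergedData), value = TRUE)

# Sum those columns row by row
Test_Scores <- rowSums(MergedData[, test_cols], na.rm = TRUE)
Test_Scores
```

Selecting by pattern keeps other numeric columns, such as the invented Age column here, out of the total.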
乱世女痞
#4 · 2019-07-03 19:24

Four previous answers and only one showing a result? What's up with that? Here's one:

> dat <- read.table(header=TRUE, text = 
  'Participant Test1 Test2 Test3 Test4 Test5 Test6 Test7     
  ParticipantA    2   4   2   3   2   3   4
  ParticipantB    1   3   2   2   3   3   3
  ParticipantC    1   4   4   2   3   4   2
  ParticipantD    2   4   2   3   2   4   4
  ParticipantE    1   3   2   2   2   2   2')

You wrote that

"...if possible, I would like to do it using a formula & loop and not having to type row by row/column by column"

You won't have to write any loops at all. The row and column functions operate on all the rows and all the columns at once, with no explicit looping.

> rowSums(dat[-1], na.rm = TRUE)
## [1] 20 17 20 21 14
> colSums(dat[-1], na.rm = TRUE)
##  Test1  Test2  Test3  Test4  Test5  Test6  Test7 
##      7     18     12     12     12     16     15 
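
To see what "no looping" means in practice, the sketch below compares rowSums() with an explicit for loop over the rows of the same data. Both give the same totals; the loop just takes more typing. The names scores, loop_totals and vec_totals are only illustrative.

```r
dat <- read.table(header = TRUE, text = "
Participant Test1 Test2 Test3 Test4 Test5 Test6 Test7
ParticipantA 2 4 2 3 2 3 4
ParticipantB 1 3 2 2 3 3 3
ParticipantC 1 4 4 2 3 4 2
ParticipantD 2 4 2 3 2 4 4
ParticipantE 1 3 2 2 2 2 2")

scores <- dat[-1]  # drop the Participant column

# Explicit loop: one sum per row
loop_totals <- numeric(nrow(scores))
for (i in seq_len(nrow(scores))) {
  loop_totals[i] <- sum(unlist(scores[i, ]), na.rm = TRUE)
}

# Vectorised equivalent
vec_totals <- rowSums(scores, na.rm = TRUE)

# TRUE when both approaches agree
all(loop_totals == vec_totals)
```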
做自己的国王
#5 · 2019-07-03 19:30

Please consult the documentation for ?rowSums and ?colSums.

It's not clear from your post exactly what MergedData is. Assuming it's a data.frame, the problem is your indexing: MergedData[Test1, Test2, Test3] passes three separate, unquoted arguments, whereas a data.frame is indexed as [rows, columns]. If it is a data.frame, you'd like to run something like:

Test_Scores <- rowSums(MergedData, na.rm = TRUE)

or

Test_Scores <- rowSums(MergedData[, c("Test1", "Test2", "Test3")], na.rm = TRUE)

if you only want to use the columns named "Test1", "Test2", and "Test3" (if they indeed are named so).
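
A short sketch of the difference, using a made-up three-column data frame as a stand-in for MergedData. Several columns have to be passed as one character vector in the column position; the names bad and good are only illustrative.

```r
# Hypothetical stand-in for MergedData
MergedData <- data.frame(
  Test1 = c(2, 1),
  Test2 = c(4, 3),
  Test3 = c(2, 2)
)

# Three separate, unquoted index arguments: this raises an error,
# which tryCatch() turns into the string "error" for illustration
bad <- tryCatch(
  MergedData[Test1, Test2, Test3],
  error = function(e) "error"
)

# One character vector in the column position works
good <- rowSums(MergedData[, c("Test1", "Test2", "Test3")], na.rm = TRUE)
good
```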

If this doesn't work, please show us the output of str(MergedData).

You need to provide a minimal reproducible example of the error to get any really helpful answers.

Root(大扎)
#6 · 2019-07-03 19:41

You could use:

MergedData$Test_Scores_Sum <- rowSums(MergedData[,2:8], na.rm=TRUE)

Here 2:8 are the positions of all the columns (tests) you want to sum up. The result is stored as a new column in your data.

This way you don't have to type each column name, and you can still have other columns in your data frame which will not be summed up. Note, however, that all the test columns you want to sum up must be next to each other (as in your example data).
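
If the test columns are not next to each other, positions such as 2:8 no longer work, but names still do. A minimal sketch, assuming every column except a known set of non-test columns should be summed (the data and the Age column are invented for illustration):

```r
# Hypothetical data with test columns separated by an unrelated column
MergedData <- data.frame(
  Participant = c("ParticipantA", "ParticipantB"),
  Test1 = c(2, 1),
  Age   = c(30, 25),   # invented column, not a test score
  Test2 = c(4, 3)
)

# Everything that is not listed here is treated as a test column
non_test  <- c("Participant", "Age")
test_cols <- setdiff(names(MergedData), non_test)

# Store the per-row total as a new column
MergedData$Test_Scores_Sum <- rowSums(MergedData[, test_cols], na.rm = TRUE)
MergedData
```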

爷的心禁止访问
#7 · 2019-07-03 19:44

Here's a way to do it with dplyr and reshape2:

dat <- read.table(header=TRUE, text = 
                    'Participant Test1 Test2 Test3 Test4 Test5 Test6 Test7     
  ParticipantA    2   4   2   3   2   3   4
  ParticipantB    1   3   2   2   3   3   3
  ParticipantC    1   4   4   2   3   4   2
  ParticipantD    2   4   2   3   2   4   4
  ParticipantE    1   3   2   2   2   2   2')

library(dplyr) 
library(reshape2)    

# Melt data into long format
dat.l = melt(dat, id.var="Participant", variable.name="Test")    
> dat.l
    Participant  Test value
1  ParticipantA Test1     2
2  ParticipantB Test1     1
3  ParticipantC Test1     1
4  ParticipantD Test1     2
...
32 ParticipantB Test7     3
33 ParticipantC Test7     2
34 ParticipantD Test7     4
35 ParticipantE Test7     2

# Sum by Participant
# %>% replaces the old %.% operator, which has been removed from dplyr
dat.l %>%
  group_by(Participant) %>%
  summarise(Sum=sum(value))

   Participant Sum
1 ParticipantA  20
2 ParticipantB  17
3 ParticipantC  20
4 ParticipantD  21
5 ParticipantE  14

# Sum by Test
dat.l %>%
  group_by(Test) %>%
  summarise(Sum=sum(value))

   Test Sum
1 Test1   7
2 Test2  18
3 Test3  12
4 Test4  12
5 Test5  12
6 Test6  16
7 Test7  15
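
The same long-format idea can also be sketched in base R, without extra packages. This swaps melt() for stack() and the grouped summarise() for tapply(), so it is an alternative illustration rather than the approach used above; the names scores, stacked, long, by_participant and by_test are only illustrative.

```r
dat <- read.table(header = TRUE, text = "
Participant Test1 Test2 Test3 Test4 Test5 Test6 Test7
ParticipantA 2 4 2 3 2 3 4
ParticipantB 1 3 2 2 3 3 3
ParticipantC 1 4 4 2 3 4 2
ParticipantD 2 4 2 3 2 4 4
ParticipantE 1 3 2 2 2 2 2")

scores  <- dat[-1]         # test columns only
stacked <- stack(scores)   # long format with columns: values, ind

# One row per (participant, test) pair; stack() goes column by column,
# so the participant labels repeat once per test column
long <- data.frame(
  Participant = rep(dat$Participant, times = ncol(scores)),
  Test  = stacked$ind,
  value = stacked$values
)

# Sum by participant and by test
by_participant <- tapply(long$value, long$Participant, sum)
by_test        <- tapply(long$value, long$Test, sum)

by_participant
by_test
```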