Getting Median of a Column where value of another Column is 1 in R

Posted 2019-09-27 17:43

OK, so I have a CSV file with a structure similar to this:

hashID,value,flag

98fafd,   35,   1

fh56w2,   25,   0

ggjeas,   55,   1

adfh5d,   45,   0

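Basically, what I want to do is get the median of the value column, but include only the rows where flag == 1 in the calculation.

Is this even possible in R? I have searched around and have not found anything like this.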

Answer 1:

Here is one possibility:

Use the following command to read your dataset:

newdata <- read.csv("stackoverflow questions/mediancol.csv")
# I assume you have the data in csv format

# Showing the data I used for the computation
newdata <- structure(list(hashID = structure(c(1L, 3L, 4L, 2L), .Label = c("98fafd", 
"adfh5d", "fh56w2", "ggjeas"), class = "factor"), value = c(35L, 
25L, 55L, 45L), flag = c(1L, 0L, 1L, 0L)), .Names = c("hashID", 
"value", "flag"), class = "data.frame", row.names = c(NA, -4L
))
> newdata
  hashID value flag
1 98fafd    35    1
2 fh56w2    25    0
3 ggjeas    55    1
4 adfh5d    45    0

# Subset the data to the rows where flag == 1
newdata1 <- subset(newdata,flag==1)

# Look at the summary of the data

> summary(newdata1)
    hashID      value         flag  
 98fafd:1   Min.   :35   Min.   :1  
 adfh5d:0   1st Qu.:40   1st Qu.:1  
 fh56w2:0   Median :45   Median :1  
 ggjeas:1   Mean   :45   Mean   :1  
            3rd Qu.:50   3rd Qu.:1  
            Max.   :55   Max.   :1

# Only look at the median 
median(newdata1$value)
[1] 45
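
For anyone who wants to try the steps above without creating a file first, here is a minimal sketch. It assumes the sample rows from the question are pasted into a string; csv_text and median_flag1 are illustrative names that do not appear in the original answer. It relies on read.csv stripping the leading spaces from numeric fields, so value and flag are still read as numbers.

```r
# Inline copy of the sample data from the question, so no file is needed.
# (csv_text is a hypothetical name used only for this sketch.)
csv_text <- "hashID,value,flag
98fafd,   35,   1
fh56w2,   25,   0
ggjeas,   55,   1
adfh5d,   45,   0"

# read.csv accepts the data as a string through its text argument.
newdata <- read.csv(text = csv_text)

# Keep only the rows where flag is 1, then take the median of value.
newdata1 <- subset(newdata, flag == 1)
median_flag1 <- median(newdata1$value)
print(median_flag1)
```

With the four sample rows, only 35 and 55 remain after the subset, so the result matches the 45 shown above.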


Answer 2:

You can also do this in a quick one-liner by indexing the data frame with a boolean array:

# read the data from a csv file
newdata <- read.csv("file.csv")
# this will give you a vector of boolean values of length nrow(newdata)
newdata$flag==1
# and this line uses the above vector to retrieve only those elements of 
# newdata$value for which the row contains a flag value of 1
median(newdata$value[newdata$flag==1])
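
If the real file may contain missing values, or if the median is wanted for every flag value at once, the same indexing idea can be extended. The following is only a sketch under those assumptions: the data frame is rebuilt by hand from the question's sample rows, and median_flag1 and medians_by_flag are illustrative names, not part of the original answer.

```r
# Sample data from the question, built by hand so the sketch is self-contained.
newdata <- data.frame(
  hashID = c("98fafd", "fh56w2", "ggjeas", "adfh5d"),
  value  = c(35, 25, 55, 45),
  flag   = c(1, 0, 1, 0)
)

# which() drops NA entries from the logical vector, and na.rm = TRUE
# makes median() ignore missing values in the value column.
median_flag1 <- median(newdata$value[which(newdata$flag == 1)], na.rm = TRUE)

# tapply() applies median() to value within each distinct flag value;
# na.rm = TRUE is passed through to median().
medians_by_flag <- tapply(newdata$value, newdata$flag, median, na.rm = TRUE)

print(median_flag1)
print(medians_by_flag)
```

The tapply() result is named by flag value, so medians_by_flag["1"] gives the same number as the one-liner.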


Source: Getting Median of a Column where value of another Column is 1 in R
Tags: r subset median