Getting different results using aggregate() and su

Posted 2019-09-19 01:47

I'm trying to get a summary data frame of the total of the variables prop.damage and crop.damage, grouped by the STATE variable, using the aggregate() function in R with the following code:

stormdata$prop.damage <- with(stormdata, ifelse(PROPDMGEXP == 'K', (PROPDMG * 10^3), ifelse(PROPDMGEXP == 'M', (PROPDMG * 10^6), ifelse(PROPDMGEXP == 'B', (PROPDMG * 10^9), NA))))
stormdata$crop.damage <- with(stormdata, ifelse(CROPDMGEXP == 'K', (CROPDMG * 10^3), ifelse(CROPDMGEXP == 'M', (CROPDMG * 10^6), ifelse(CROPDMGEXP == 'B', (CROPDMG * 10^9), NA))))
damagecost <- with(stormdata, aggregate(x = prop.damage + crop.damage, by = list(STATE), FUN = sum, na.rm = TRUE))
damagecost <- damagecost[order(damagecost$x, decreasing = TRUE), ]

Here the PROPDMGEXP and CROPDMGEXP variables are used as multipliers for the numeric variables PROPDMG and CROPDMG. My main data set is stormdata.

And I get the following:

> head(damagecost)
   Group.1            x
8       CA 120211639720
13      FL  27302948100
38      MS  14804212820
63      TX  12550131850
20      IL  11655920860
2       AL   9505473250

But if, for example, I do the addition "manually" for California ('CA'), I get this:

> sum(stormdata$prop.damage[stormdata$STATE == 'CA'], na.rm = TRUE) + sum(stormdata$crop.damage[stormdata$STATE == 'CA'], na.rm = TRUE)
[1] 127115859410

I don't understand why I'm getting different results.

Tags: sum aggregate

1 answer

太酷不给撩

#2 · 2019-09-19 02:18

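It turns out that both prop.damage and crop.damage contained NA values, and those NAs affected the result when the two variables were added inside aggregate(). The expression prop.damage + crop.damage is evaluated row by row, so any row where either value is NA yields NA, and na.rm = TRUE then drops that row entirely, including its non-missing value. Summing each column separately with na.rm = TRUE drops only the individual NA cells, which is why the "manual" total is larger.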
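
A minimal sketch of the effect, using a made-up three-row data frame (the name toy and its values are hypothetical, not taken from stormdata). It also shows one possible fix, which assumes it is acceptable to treat a missing damage value as zero before adding:

```r
# Toy data: hypothetical values, not from the storm data set
toy <- data.frame(
  STATE       = c("CA", "CA", "CA"),
  prop.damage = c(100, NA, 50),
  crop.damage = c(10, 20, NA)
)

# Row-wise addition first: a row with NA in either column becomes NA,
# so na.rm = TRUE discards the whole row, including its non-missing part.
rowwise <- with(toy, aggregate(x = prop.damage + crop.damage,
                               by = list(STATE), FUN = sum, na.rm = TRUE))

# Summing each column separately drops only the individual NA cells.
separate <- sum(toy$prop.damage, na.rm = TRUE) +
  sum(toy$crop.damage, na.rm = TRUE)

# Possible fix (assumption: a missing value counts as zero damage):
# replace NA with 0 before adding, then aggregate the total.
toy$total <- with(toy, ifelse(is.na(prop.damage), 0, prop.damage) +
                       ifelse(is.na(crop.damage), 0, crop.damage))
fixed <- aggregate(total ~ STATE, data = toy, FUN = sum)

print(rowwise$x)    # total from row-wise addition inside aggregate()
print(separate)     # total from summing the columns separately
print(fixed$total)  # total after replacing NA with 0
```

With this toy data the row-wise version counts only the first row, while the other two totals agree with each other. Whether replacing NA with 0 is appropriate depends on what a missing exponent code means in the real data.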
