I have the following data frame:
df<-structure(list(totprivland = c(175L, 50L, 100L, 14L, 4L, 240L,
10L, 20L, 20L, 58L), ncushr8d1 = c(0L, 0L, 0L, 0L, 0L, 30L, 5L,
0L, 0L, 50L), ncu_CENREG1 = structure(c(4L, 4L, 4L, 4L, 1L, 3L,
3L, 3L, 4L, 4L), .Label = c("Northeast", "Midwest", "South",
"West"), class = "factor"), ncushr8d2 = c(75L, 50L, 100L, 14L,
2L, 30L, 5L, 20L, 20L, 8L), ncu_CENREG2 = structure(c(4L, 4L,
4L, 4L, 1L, 2L, 1L, 4L, 3L, 4L), .Label = c("Northeast", "Midwest",
"South", "West"), class = "factor"), ncushr8d3 = c(100L, NA,
NA, NA, 2L, 180L, 0L, NA, NA, NA), ncu_CENREG3 = structure(c(4L,
NA, NA, NA, 1L, 1L, 3L, NA, NA, NA), .Label = c("Northeast",
"Midwest", "South", "West"), class = "factor"), ncushr8d4 = c(NA,
NA, NA, NA, 0L, NA, NA, NA, NA, NA), ncu_CENREG4 = structure(c(NA,
NA, NA, NA, 1L, NA, NA, NA, NA, NA), .Label = c("Northeast",
"Midwest", "South", "West"), class = "factor")), .Names = c("totprivland",
"ncushr8d1", "ncu_CENREG1", "ncushr8d2", "ncu_CENREG2", "ncushr8d3",
"ncu_CENREG3", "ncushr8d4", "ncu_CENREG4"), row.names = c(27404L,
27525L, 27576L, 27822L, 28099L, 28238L, 28306L, 28312L, 28348L,
28379L), class = "data.frame")
The above is the dput output.
The basic idea is as follows:
Total  VariableA  LocationA  VariableB  LocationB
30     20         East       10         East
20     20         South      NA         West
115    15         East       100        South
100    50         West       50         West
35     10         East       25         South
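Total (totprivland in the dput example) is the sum of the variables (ncushr8d1, ncushr8d2, ncushr8d3, and ncushr8d4), and each variable has a corresponding factor variable giving its location (ncu_CENREG1, etc.). There are 6 additional variables and locations that follow this same pattern. The location is often the same for several of the numeric variables in a row (for example, the two "East" locations in the first row of the example).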
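As a quick sanity check of the relationship described above, here is a minimal sketch that rebuilds only the numeric columns from the dput and confirms that they add up to totprivland when NA values are ignored. The name `shares` is a hypothetical stand-in for the full `df`, and selecting columns with the pattern `^ncushr8d` assumes the remaining variables follow the same naming scheme.

```r
# Numeric columns copied from the dput above; `shares` is a hypothetical
# stand-in for the full data frame `df`.
shares <- data.frame(
  totprivland = c(175L, 50L, 100L, 14L, 4L, 240L, 10L, 20L, 20L, 58L),
  ncushr8d1   = c(0L, 0L, 0L, 0L, 0L, 30L, 5L, 0L, 0L, 50L),
  ncushr8d2   = c(75L, 50L, 100L, 14L, 2L, 30L, 5L, 20L, 20L, 8L),
  ncushr8d3   = c(100L, NA, NA, NA, 2L, 180L, 0L, NA, NA, NA),
  ncushr8d4   = c(NA, NA, NA, NA, 0L, NA, NA, NA, NA, NA)
)

# Pick the share columns by name pattern (assumed naming scheme).
share_cols <- grep("^ncushr8d", names(shares), value = TRUE)

# Row-wise sum that skips missing values.
row_total <- rowSums(shares[share_cols], na.rm = TRUE)

# Compare against the stated total for every row.
all(row_total == shares$totprivland)
```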
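For each row, I would like to sum the values that share a common location and store each location's total in a new column. The result would look something like this, with NA values ignored: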
Total  VariableA  LocationA  VariableB  LocationB  TotalWest  TotalEast  TotalSouth
30     20         East       10         East       0          30         0
20     20         South      NA         NA         0          0          20
115    15         East       100        South      0          15         100
100    50         West       50         West       100        0          0
35     10         East       25         South      0          10         25
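One possible way to get this shape without reshaping to long format is sketched below, using only base R. It is an illustration on the toy table, not a tested solution for the full survey data: `toy`, `value_cols`, `location_cols`, and `regions` are hypothetical names, and the approach assumes the value and location columns are listed in matching order. Each row stays a single observation, so survey weights would not be duplicated.

```r
# Toy data mirroring the illustrative input table (hypothetical object name).
toy <- data.frame(
  Total     = c(30, 20, 115, 100, 35),
  VariableA = c(20, 20, 15, 50, 10),
  LocationA = c("East", "South", "East", "West", "East"),
  VariableB = c(10, NA, 100, 50, 25),
  LocationB = c("East", "West", "South", "West", "South"),
  stringsAsFactors = FALSE
)

# Paired columns: the i-th value column belongs to the i-th location column.
value_cols    <- c("VariableA", "VariableB")
location_cols <- c("LocationA", "LocationB")
regions       <- c("West", "East", "South")

vals <- as.matrix(toy[value_cols])     # numeric matrix of values
locs <- as.matrix(toy[location_cols])  # character matrix of locations

for (r in regions) {
  # TRUE where the location is known and equals the current region.
  hit <- !is.na(locs) & locs == r
  # Zero out non-matching cells, then sum across each row, skipping NA values.
  toy[[paste0("Total", r)]] <- rowSums(vals * hit, na.rm = TRUE)
}

toy
```

For the real data, the column vectors could presumably be built with `grep("^ncushr8d", names(df), value = TRUE)` and `grep("^ncu_CENREG", names(df), value = TRUE)`, with `regions` taken from the factor levels, provided the two sets of names sort into matching pairs.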
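I have looked at aggregate and split, but can't figure out how to make them work across this many columns. I was also considering a long "if" statement that cycles through all 8 variables and their corresponding locations, but I feel there must be a better solution. The observations are weighted for use with the survey package, and I would like to avoid duplicating observations by making them "long" with the reshape package, although perhaps I could recombine them afterwards. Any suggestions are appreciated!
Many thanks, Luke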