I have a dataframe that contains a plot ID (plotID), tree species code (species), and a cover value (cover). You can see there are multiple records of tree species within one of the plots. How can I sum the "cover" field if there are duplicate "species" rows within each plot?
For example, here is some sample data:
# Sample Data
plotID = c( "SUF200001035014", "SUF200001035014", "SUF200001035014", "SUF200001035014", "SUF200001035014", "SUF200046012040",
"SUF200046012040", "SUF200046012040", "SUF200046012040", "SUF200046012040", "SUF200046012040", "SUF200046012040")
species = c("ABBA", "BEPA", "PIBA2", "PIMA", "PIRE", "PIBA2", "PIBA2", "PIMA", "PIMA", "PIRE", "POTR5", "POTR5")
cover = c(26.893939, 5.681818, 9.469697, 16.287879, 1.893939, 16.287879, 4.166667, 10.984848, 16.666667, 11.363636, 18.181818,
13.257576)
df_original = data.frame(plotID, species, cover)
And here is the intended output:
# Intended Output
plotID2 = c( "SUF200001035014", "SUF200001035014", "SUF200001035014", "SUF200001035014", "SUF200001035014", "SUF200046012040",
"SUF200046012040", "SUF200046012040", "SUF200046012040")
species2 = c("ABBA", "BEPA", "PIBA2", "PIMA", "PIRE", "PIBA2", "PIMA", "PIRE", "POTR5")
cover2 = c(26.893939, 5.681818, 9.469697, 16.287879, 1.893939, 20.454546, 18.651515, 11.363636, 31.439394)
df_intended_output = data.frame(plotID2, species2, cover2)
As mentioned above, ddply from the plyr package
You can do this in a number of ways. Using base-r,
dplyr
anddata.table
would be the most typical.Here is
dplyr
's way:This would be the base-r way:
And a data.table way -
Easy with
aggregate
Easier with
data.table