Adding rows to a data.frame conditionally

Posted 2019-02-27 03:55

Question:

I have a large data.frame of flower and fruit counts for a plant from a 30-year survey. I want to add rows with zeros (0) for the specific months in which the plant had no flowers or fruits (because it is a seasonal species).

Example:

Year Month Flowers Fruits
2004 6      25      2
2004 7      48      4
2005 7      20      1
2005 8      16      1

I want to add the months that are not included, with values of zero, so I was thinking of a function that recognizes the missing months and fills them with 0.

Thanks.

Answer 1:

## x is the data frame you gave in the question

x <- data.frame(
  Year = c(2004, 2004, 2005, 2005),
  Month = c(6, 7, 7, 8),
  Flowers = c(25, 48, 20, 16),
  Fruits = c(2, 4, 1, 1)
)

## y is the data frame that will provide the missing values,
## so you can replace 2004 and 2005 with whatever your desired
## time interval is

y <- expand.grid(Year = 2004:2005, Month = 1:12)

## this final step fills in the missing Year/Month rows and replaces NAs with zeros

library(tidyr)
x <- merge(x, y, all = TRUE) %>%
  replace_na(list(Flowers = 0, Fruits = 0))

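## if you don't want to use tidyr, you can alternatively do the following
## (note that this replaces every NA in the data frame, not only those
## in Flowers and Fruits)

x <- merge(x, y, all = TRUE)
x[is.na(x)] <- 0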

It looks like this:

head(x, 10)

#    Year Month Flowers Fruits
# 1  2004     1       0      0
# 2  2004     2       0      0
# 3  2004     3       0      0
# 4  2004     4       0      0
# 5  2004     5       0      0
# 6  2004     6      25      2
# 7  2004     7      48      4
# 8  2004     8       0      0
# 9  2004     9       0      0
# 10 2004    10       0      0
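
The base R approach above can also be wrapped in a small function, which is roughly what the question asked for. The following is a minimal sketch, not part of the original answer: the name `fill_missing_months` and its `count_cols` argument are made up for illustration, and it assumes the columns are named as in the question. It derives the range of years from the data and zero-fills only the count columns, so NAs in any other column would be left alone.

```r
# Hypothetical helper (not from the original answer): pad a survey with
# zero-count rows for every Year/Month combination that is absent.
fill_missing_months <- function(df, count_cols = c("Flowers", "Fruits")) {
  # Every month of every year between the first and last surveyed year
  grid <- expand.grid(
    Year = seq(min(df$Year), max(df$Year)),
    Month = 1:12
  )
  # Full outer join; combinations absent from df get NA in the count columns
  out <- merge(grid, df, by = c("Year", "Month"), all = TRUE)
  # Zero-fill only the count columns, leaving NAs elsewhere untouched
  for (col in count_cols) {
    out[[col]][is.na(out[[col]])] <- 0
  }
  # Sort chronologically and reset the row names
  out <- out[order(out$Year, out$Month), ]
  rownames(out) <- NULL
  out
}

x <- data.frame(
  Year = c(2004, 2004, 2005, 2005),
  Month = c(6, 7, 7, 8),
  Flowers = c(25, 48, 20, 16),
  Fruits = c(2, 4, 1, 1)
)

filled <- fill_missing_months(x)
nrow(filled)  # 24 rows: 12 months for each of the 2 years
```

For the 30-year survey, years with no observations at all would also be filled in, because the grid spans every year between the first and the last one present in the data.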


Answer 2:

Here is another option using expand() and left_join(), where df1 is the data frame from the question:

library(dplyr)
library(tidyr)
expand(df1, Year, Month = 1:12) %>%
  left_join(df1, by = c("Year", "Month")) %>%
  replace_na(list(Flowers = 0, Fruits = 0))
#    Year Month Flowers Fruits
#   <int> <int>   <dbl>  <dbl>
#1   2004     1       0      0
#2   2004     2       0      0
#3   2004     3       0      0
#4   2004     4       0      0
#5   2004     5       0      0
#6   2004     6      25      2
#7   2004     7      48      4
#8   2004     8       0      0
#9   2004     9       0      0
#10  2004    10       0      0
#..   ...   ...     ...    ...
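
As a possible shortcut, tidyr's `complete()` combines the expand, join, and fill steps into one call. This is a sketch rather than part of the original answer; `df1` is rebuilt here from the question's example, with integer Year and Month columns assumed to match the column types printed above.

```r
library(tidyr)

# Assumed reconstruction of df1 from the example in the question
df1 <- data.frame(
  Year = c(2004L, 2004L, 2005L, 2005L),
  Month = c(6L, 7L, 7L, 8L),
  Flowers = c(25, 48, 20, 16),
  Fruits = c(2, 4, 1, 1)
)

# Add every missing Year/Month combination and put 0 in the count
# columns of the newly created rows
filled <- complete(
  df1, Year, Month = 1:12,
  fill = list(Flowers = 0, Fruits = 0)
)
nrow(filled)  # 24 rows: 12 months for each of the 2 years
```

Like `expand(df1, Year, ...)`, this only covers years that already appear in the data; a year with no observations at all would have to be supplied explicitly, for example with `Year = 2004:2005`.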