可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
This question already has answers here:
Closed 2 years ago.
I have a data frame in R
like this:
ID MONTH-YEAR VALUE
110 JAN. 2012 1000
111 JAN. 2012 2000
. .
. .
121 FEB. 2012 3000
131 FEB. 2012 4000
. .
. .
So, for each month of each year there are n
rows and they can be in any order(mean they all are not in continuity and are at breaks). I want to calculate how many rows are there for each MONTH-YEAR
i.e. how many rows are there for JAN. 2012, how many for FEB. 2012 and so on. Something like this:
MONTH-YEAR NUMBER OF ROWS
JAN. 2012 10
FEB. 2012 13
MAR. 2012 6
APR. 2012 9
I tried to do this:
n_row <- nrow(dat1_frame %.% group_by(MONTH-YEAR))
but it does not produce the desired output.How can I do that?
回答1:
Here's an example that shows how table(.)
(or, more closely matching your desired output, data.frame(table(.))
does what it sounds like you are asking for.
Note also how to share reproducible sample data in a way that others can copy and paste into their session.
Here's the (reproducible) sample data:
mydf <- structure(list(ID = c(110L, 111L, 121L, 131L, 141L),
MONTH.YEAR = c("JAN. 2012", "JAN. 2012",
"FEB. 2012", "FEB. 2012",
"MAR. 2012"),
VALUE = c(1000L, 2000L, 3000L, 4000L, 5000L)),
.Names = c("ID", "MONTH.YEAR", "VALUE"),
class = "data.frame", row.names = c(NA, -5L))
mydf
# ID MONTH.YEAR VALUE
# 1 110 JAN. 2012 1000
# 2 111 JAN. 2012 2000
# 3 121 FEB. 2012 3000
# 4 131 FEB. 2012 4000
# 5 141 MAR. 2012 5000
Here's the calculation of the number of rows per group, in two output display formats:
table(mydf$MONTH.YEAR)
#
# FEB. 2012 JAN. 2012 MAR. 2012
# 2 2 1
data.frame(table(mydf$MONTH.YEAR))
# Var1 Freq
# 1 FEB. 2012 2
# 2 JAN. 2012 2
# 3 MAR. 2012 1
回答2:
The count()
function in plyr
does what you want:
library(plyr)
count(mydf, "MONTH-YEAR")
回答3:
Using the example data set that Ananda dummied up, here's an example using aggregate()
, which is part of core R. aggregate()
just needs something to count as function of the different values of MONTH-YEAR
. In this case, I used VALUE
as the thing to count:
aggregate(cbind(count = VALUE) ~ MONTH.YEAR,
data = mydf,
FUN = function(x){NROW(x)})
which gives you..
MONTH.YEAR count
1 FEB. 2012 2
2 JAN. 2012 2
3 MAR. 2012 1
回答4:
Try using the count function in dplyr:
library(dplyr)
dat1_frame %>%
count(MONTH.YEAR)
I am not sure how you got MONTH-YEAR as a variable name. My R version does not allow for such a variable name, so I replaced it with MONTH.YEAR.
As a side note, the mistake in your code was that dat1_frame %.% group_by(MONTH-YEAR)
without a summarise
function returns the original data frame without any modifications. So, you want to use
dat1_frame %>%
group_by(MONTH.YEAR) %>%
summarise(count=n())
回答5:
library(plyr)
ddply(data, .(MONTH-YEAR), nrow)
This will give you the answer, if "MONTH-YEAR" is a variable.
First, try unique(data$MONTH-YEAR) and see if it returns unique values (no duplicates).
Then above simple split-apply-combine will return what you are looking for.
回答6:
Just for completion the data.table solution:
library(data.table)
mydf <- structure(list(ID = c(110L, 111L, 121L, 131L, 141L),
MONTH.YEAR = c("JAN. 2012", "JAN. 2012",
"FEB. 2012", "FEB. 2012",
"MAR. 2012"),
VALUE = c(1000L, 2000L, 3000L, 4000L, 5000L)),
.Names = c("ID", "MONTH.YEAR", "VALUE"),
class = "data.frame", row.names = c(NA, -5L))
setDT(mydf)
mydf[, .(`Number of rows` = .N), by = MONTH.YEAR]
MONTH.YEAR Number of rows
1: JAN. 2012 2
2: FEB. 2012 2
3: MAR. 2012 1
回答7:
Suppose we have a df_data data frame as below
> df_data
ID MONTH-YEAR VALUE
1 110 JAN.2012 1000
2 111 JAN.2012 2000
3 121 FEB.2012 3000
4 131 FEB.2012 4000
5 141 MAR.2012 5000
To count number of rows in df_data grouped by MONTH-YEAR column, you can use:
> summary(df_data$`MONTH-YEAR`)
FEB.2012 JAN.2012 MAR.2012
2 2 1
summary function will create a table from the factor argument, then create a vector for the result (line 7 & 8)
回答8:
Here is another way of using aggregate
to count rows by group:
my.data <- read.table(text = '
month.year my.cov
Jan.2000 apple
Jan.2000 pear
Jan.2000 peach
Jan.2001 apple
Jan.2001 peach
Feb.2002 pear
', header = TRUE, stringsAsFactors = FALSE, na.strings = NA)
rows.per.group <- aggregate(rep(1, length(my.data$month.year)),
by=list(my.data$month.year), sum)
rows.per.group
# Group.1 x
# 1 Feb.2002 1
# 2 Jan.2000 3
# 3 Jan.2001 2