可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I have a data frame where each line represents an individual. That data frame has two variables: age and year. I want to make a table of average ages per year. How can I do it?

The best I could come up with was xtabs(age ~ year, dataframe), but this gives me the sum of ages per year.

回答1:

Use aggregate:

xtabs(hp~cyl+gear,aggregate(hp~cyl+gear,mtcars,mean))
   gear
cyl        3        4        5
  4  97.0000  76.0000 102.0000
  6 107.5000 116.5000 175.0000
  8 194.1667   0.0000 299.5000

回答2:

Have a look at the plyr package, specifically ddply

ddply(dataframe, .(year), summarise, mean(age))

If you have to use xtabs... see the above answer.

回答3:

I'm glad that xtabs solution works for you. I must have a package loaded that interferes with it for me (and gives an error). Another solution would be:

tapply(dfrm$age, dfrm$year, FUN=mean)

To get additional dimensions to the table (array) just keep adding additional factors to the second INDEX argument in the form list(fac1, fac2, fac3).

Applied to the example using mtcars:

 tapply(mtcars$hp, list(mtcars$cyl,mtcars$gear), mean)
         3     4     5
4  97.0000  76.0 102.0
6 107.5000 116.5 175.0
8 194.1667    NA 299.5

Or even more compactly:

with(mtcars, tapply(hp, list(cyl, gear), mean))

回答4:

Another solution is to use 2 xtabs functions, first one to calculate the sum of the ages, 2nd one to get the headcount for each group, then get the mean by sum/headcount.

E.g., xtabs(age ~ year, dataframe)/xtabs( ~ year, dataframe)

However, the aggregate approach can be used to calculate median or other functions applied to the dataframe. Definitely more flexible.