Julia: Create summary values for column x for each

Posted 2019-05-23 12:02

I would like to apply some functions, such as mean and variance, to column x of my DataFrame for each unique value in column y. I can imagine building a loop that manually subsets the DataFrame to do this, but I am trying not to reinvent the wheel for something that is likely a common feature.

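using DataFrames, Random, Statistics  # Random provides randstring; Statistics provides mean
mydf = DataFrame(y = [randstring(1) for i in 1:1000], x = rand(1000))
# I could imagine a function that looks like this (pseudocode, not a real API):
# apply(f = mean, across = mydf[:x], by = mydf[:y])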

Tags: dataframe julia

1 Answer

兄弟一词,经得起流年.

#2 · 2019-05-23 12:10

You're right, this is very common. Take a look at the split-apply-combine chapter in the DataFrames documentation. There are several approaches: you can use the more general by function to specify exactly which columns you want to operate on, or you can use the handy aggregate function, which applies the function to all the other columns and names the results sensibly:

julia> aggregate(mydf, :y, mean)
62×2 DataFrames.DataFrame
│ Row │ y   │ x_mean   │
├─────┼─────┼──────────┤
│ 1   │ "0" │ 0.454196 │
│ 2   │ "1" │ 0.541434 │
│ 3   │ "2" │ 0.36734  │
⋮
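
For readers on a recent DataFrames.jl: as far as I know, by and aggregate were deprecated and then removed in the 1.0 release, with groupby plus combine as the replacement. The sketch below assumes DataFrames 1.x and uses a small hand-made table instead of the random mydf so the results can be checked by hand. The names df, stats, x_mean and x_var are my own choices for illustration, not anything from the question.

```julia
using DataFrames, Statistics

# Small deterministic table (illustrative data, not the asker's mydf).
df = DataFrame(y = ["a", "a", "b", "b", "b"],
               x = [1.0, 3.0, 2.0, 4.0, 6.0])

# Split by :y, apply mean and var to :x, and combine into one row per group.
# Each `source => function => target` pair produces one output column.
stats = combine(groupby(df, :y),
                :x => mean => :x_mean,
                :x => var  => :x_var)

println(stats)
```

The same call should work on mydf from the question by replacing df with mydf. Further summaries (for example :x => std => :x_std) can be added as extra pairs in the same combine call.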
