How to get number of groups in a groupby object in

Posted 2019-01-27 12:21

This would be useful so I know how many unique groups I have to perform calculations on. Thank you.

Suppose groupby object is called dfgroup.

2 answers
兄弟一词,经得起流年.
Reply #2 · 2019-01-27 12:43

Setup

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
    'C': np.random.choice(10, 8),
    'D': np.random.choice(10, 8)})

df
     A      B  C  D
0  foo    one  5  2
1  bar    one  0  4
2  foo    two  3  7
3  bar  three  3  6
4  foo    two  7  8
5  bar    two  9  8
6  foo    one  3  1
7  foo  three  5  6

g = df.groupby(['A', 'B'])

As of pandas v0.23, there are several options.

ngroups

Newer versions of the groupby API provide this (undocumented) attribute which stores the number of groups in a GroupBy object.

g.ngroups
# 6

len

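You can call len either on the GroupBy object directly or on its GroupBy.groups attribute. len(g) delegates to GroupBy.__len__ to retrieve the number of groups, while g.groups is a dict-like mapping from group key to row labels, so len(g.groups) counts its keys.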

len(g)
# 6

len(g.groups)
# 6

Generator Expression

For completeness, you can also iterate over groups.

sum(1 for _ in g)
# 6
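As a quick sanity check, the sketch below runs the approaches above side by side on a small made-up frame (df_demo and g_demo are hypothetical names, not part of the setup above) and also compares them against the number of distinct key combinations. It assumes there are no NaNs in the grouping columns.

```python
import pandas as pd

# Hypothetical data: same key columns as the setup above, fewer rows.
df_demo = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo'],
    'B': ['one', 'one', 'two', 'one', 'one'],
    'C': [1, 2, 3, 4, 5],
})
g_demo = df_demo.groupby(['A', 'B'])

counts = {
    'ngroups': g_demo.ngroups,                # attribute on the GroupBy object
    'len(g)': len(g_demo),                    # GroupBy.__len__
    'len(g.groups)': len(g_demo.groups),      # number of keys in the groups mapping
    'generator': sum(1 for _ in g_demo),      # iterate over (key, sub-frame) pairs
    'drop_duplicates': len(df_demo[['A', 'B']].drop_duplicates()),  # distinct key pairs
}
print(counts)
```

All five values should agree here; ngroups and len(g) avoid iterating over the groups, so they are the natural choice when only the count is needed.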

If you wanted to actually print those groups out, you could do something like

# from __future__ import print_function # python-2.7 users
print(*(g_ for _, g_ in g), sep='\n\n')

Addendum
If you're looking to find the size of each group, you can use DataFrameGroupBy.size:

g.size()
A    B    
bar  one      1
     three    1
     two      1
foo  one      2
     three    1
     two      2
dtype: int64

Note that size counts NaNs as well. If you don't want NaNs counted, use g.count() instead.
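To make the difference concrete, here is a minimal sketch on a made-up frame with one missing value (df_nan and g_nan are hypothetical names). Note that count works per column, so a column is selected before calling it.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 'foo' has two rows, one of which has NaN in column C.
df_nan = pd.DataFrame({
    'A': ['foo', 'foo', 'bar'],
    'C': [1.0, np.nan, 3.0],
})
g_nan = df_nan.groupby('A')

sizes = g_nan.size()          # rows per group, NaN rows included
counts = g_nan['C'].count()   # non-NaN values of C per group

print(sizes)
print(counts)
```

For the 'foo' group, size reports 2 while count reports 1, because the NaN in C is excluded from the count.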

And lastly, there's also the option of value_counts through df.groupby('A').B.value_counts(). For this data it gives the same per-group counts as g.size(), but it groups on one column instead of two, sorts by count within each value of A, and skips NaNs in B by default.
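A small sketch comparing value_counts with the per-group sizes from a two-column groupby, on a made-up frame without NaNs (df_vc is a hypothetical name). The two results are converted to dicts before comparing, since their row order can differ.

```python
import pandas as pd

# Hypothetical data with no missing values in the key columns.
df_vc = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo'],
    'B': ['one', 'one', 'two', 'one', 'one'],
})

by_size = df_vc.groupby(['A', 'B']).size()          # group on both columns
by_vc = df_vc.groupby('A')['B'].value_counts()      # group on A, count values of B

# Both are Series indexed by (A, B); compare as dicts to ignore row order.
same = by_vc.to_dict() == by_size.to_dict()
print(same)
```

With NaNs in B the two could diverge, since value_counts drops NaNs by default.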

smile是对你的礼貌
Reply #3 · 2019-01-27 12:54

As documented, you can get the number of groups with len(dfgroup).
