count the frequency that a value occurs in a dataf - Answers, page 2

I have a dataset

|category|
cat a
cat b
cat a

I'd like to be able to return something like (showing unique values and frequency)

category | freq |
cat a       2
cat b       1

Tags: python pandas

14 answers

呛了眼睛熬了心

#2 · 2019-01-01 00:50

Without any libraries, you could do this instead:

def to_frequency_table(data):
    frequencytable = {}
    for key in data:
        if key in frequencytable:
            frequencytable[key] += 1
        else:
            frequencytable[key] = 1
    return frequencytable

Example:

to_frequency_table([1,1,1,1,2,3,4,4])
>>> {1: 4, 2: 1, 3: 1, 4: 2}
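
The standard library's collections.Counter does the same tallying in one call. A minimal sketch, reusing the example list above; the variable names are only illustrative:

```python
from collections import Counter

data = [1, 1, 1, 1, 2, 3, 4, 4]

# Counter is a dict subclass that maps each value to its number of occurrences.
freq = Counter(data)

# most_common(n) lists (value, count) pairs, highest count first.
top = freq.most_common(1)

print(dict(freq))
print(top)
```

Because Counter is a dict subclass, the result can be used anywhere the hand-written frequency table would be.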


公子世无双

#3 · 2019-01-01 00:50

Use the size() method:

    import pandas as pd
    # df is your DataFrame; groupby is a method, so call it with parentheses
    print(df.groupby('category').size())
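
For reference, here is a self-contained sketch of the size() approach using the three rows from the question. The reset_index step is an addition, not part of the answer; it is one way to get the two-column layout the question asks for, and the column name 'freq' is taken from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'category': ['cat a', 'cat b', 'cat a']})

# size() counts the rows in each group and returns a Series indexed by category.
sizes = df.groupby('category').size()

# reset_index(name=...) turns that Series into a two-column DataFrame.
table = sizes.reset_index(name='freq')
print(table)
```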


还给你的自由

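#4 · 2019-01-01 00:55

Use a list comprehension with value_counts to get the counts for several columns of a DataFrame at once (the DataFrame is named my_series here); select_dtypes(include=['O']) picks out the object-dtype columns: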

[my_series[c].value_counts() for c in list(my_series.select_dtypes(include=['O']).columns)]

https://stackoverflow.com/a/28192263/786326
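
A small variation on the same idea, sketched with made-up data: a dict comprehension keeps each result labelled by its column name, which a plain list does not. The column names and values below are assumptions for illustration, and the text columns are given object dtype explicitly so that the select_dtypes filter matches them.

```python
import pandas as pd

# Made-up data: two text columns stored with object dtype, one numeric column.
df = pd.DataFrame({
    'currency': pd.Series(['USD', 'EUR', 'USD'], dtype=object),
    'ota': pd.Series(['Hades', 'Hades', 'Zeus'], dtype=object),
    'amount': [10, 20, 30],
})

# Keep only the object-dtype columns, then count values in each one.
object_cols = df.select_dtypes(include=['object']).columns
counts = {col: df[col].value_counts() for col in object_cols}

for col, series in counts.items():
    print(col)
    print(series)
```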


孤独总比滥情好

#5 · 2019-01-01 01:01

Use groupby and count:

In [37]:
df = pd.DataFrame({'a':list('abssbab')})
df.groupby('a').count()

Out[37]:

   a
a   
a  2
b  3
s  2

[3 rows x 1 columns]

See the online docs: http://pandas.pydata.org/pandas-docs/stable/groupby.html

You can also use value_counts(), as @DSM commented; there are many ways to skin a cat here:

In [38]:
df['a'].value_counts()

Out[38]:

b    3
a    2
s    2
dtype: int64

If you want to add the frequency back to the original dataframe, use transform, which returns a result aligned with the original index:

In [41]:
df['freq'] = df.groupby('a')['a'].transform('count')
df

Out[41]:

   a freq
0  a    2
1  b    3
2  s    2
3  s    2
4  b    3
5  a    2
6  b    3

[7 rows x 2 columns]
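
As an alternative sketch (not from the answer above), the same aligned column can be built by mapping each value to its entry in value_counts(). This assumes the same example frame as above.

```python
import pandas as pd

df = pd.DataFrame({'a': list('abssbab')})

# value_counts() returns a Series indexed by the distinct values of 'a'.
counts = df['a'].value_counts()

# map() looks up each row's value in that Series, so the result
# has one entry per original row, in the original order.
df['freq'] = df['a'].map(counts)
print(df)
```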


若你有天会懂

#6 · 2019-01-01 01:01

In pandas 0.18.1, groupby together with count does not give the frequency of unique values:

>>> df
   a
0  a
1  b
2  s
3  s
4  b
5  a
6  b

>>> df.groupby('a').count()
Empty DataFrame
Columns: []
Index: [a, b, s]

However, the unique values and their frequencies are easily determined using size:

>>> df.groupby('a').size()
a
a    2
b    3
s    2

With df.a.value_counts(), the result is sorted by count in descending order (most frequent value first) by default.
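
The two methods answer slightly different questions: size() counts the rows in each group, while count() counts the non-null values in each of the remaining columns. That is presumably why a frame whose only column is the grouping key leaves count() with nothing to report. A minimal sketch of the difference, where the extra column 'x' and its missing values are made up for illustration:

```python
import numpy as np
import pandas as pd

# Same 'a' column as above, plus a hypothetical column 'x' with two missing values.
df = pd.DataFrame({
    'a': list('abssbab'),
    'x': [1.0, np.nan, 3.0, 4.0, 5.0, np.nan, 7.0],
})

# Rows per group, regardless of missing values.
sizes = df.groupby('a').size()

# Non-null values of 'x' per group.
counts = df.groupby('a')['x'].count()

print(sizes)
print(counts)
```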


爱死公子算了

#7 · 2019-01-01 01:01

You can also do this with pandas by first casting your columns to categories (dtype="category"), e.g.

cats = ['client', 'hotel', 'currency', 'ota', 'user_country']

df[cats] = df[cats].astype('category')

and then calling describe:

df[cats].describe()

This will give you a nice table of value counts and a bit more :):

        client   hotel  currency     ota  user_country
count   852845  852845    852845  852845        852845
unique    2554   17477       132      14           219
top       2198   13202       USD   Hades            US
freq    102562    8847    516500  242734        340992
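
To read individual figures out of that summary, index it by row label and column name. A small sketch with made-up data (the two columns and their values are assumptions, not the answer's dataset):

```python
import pandas as pd

# Made-up data standing in for the much larger frame in the answer.
df = pd.DataFrame({
    'currency': ['USD', 'EUR', 'USD'],
    'ota': ['Hades', 'Hades', 'Zeus'],
})

cats = ['currency', 'ota']
df[cats] = df[cats].astype('category')

# For non-numeric columns, describe() reports count, unique, top and freq.
summary = df[cats].describe()
print(summary)

# 'top' is the most frequent value and 'freq' is how often it occurs.
most_common_currency = summary.loc['top', 'currency']
its_frequency = summary.loc['freq', 'currency']
```

Note that describe() only reports the single most frequent value per column; for the full frequency table of a column, value_counts() is still needed.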

