In numpy / scipy, is there an efficient way to get frequency counts for unique values in an array?
Something along these lines:
x = array( [1,1,1,2,2,2,5,25,1,1] )
y = freq_count( x )
print y
>> [[1, 5], [2,3], [5,1], [25,1]]
(For you, R users out there, I'm basically looking for the table() function.)
Something like this should do it:
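A minimal sketch of one such approach, assuming numpy.unique with return_counts=True (available since numpy 1.9) is acceptable:

import numpy as np

x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])

# unique values and the number of times each one occurs
unique, counts = np.unique(x, return_counts=True)
print(np.asarray((unique, counts)).T)
# [[ 1  5]
#  [ 2  3]
#  [ 5  1]
#  [25  1]]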
Also, this previous post on Efficiently counting unique elements seems pretty similar to your question, unless I'm missing something.
To count unique non-integers - similar to Eelco Hoogendoorn's answer, but considerably faster (a factor of 5 on my machine) - I used weave.inline to combine numpy.unique with a bit of C code.

Profile info

Eelco's pure numpy version:

Note: there's redundancy here (unique performs a sort also), meaning that the code could probably be further optimized by putting the unique functionality inside the C-code loop.
numpy.bincount is probably the best choice. If your array contains anything besides small dense integers, it might be useful to wrap it in something like this:
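A sketch of such a wrapper (the exact shape of the helper is an assumption): use searchsorted to map the values down to small, dense integer indices and let bincount count those. For example:

import numpy as np

def count_unique(keys):
    # Unique values, sorted; searchsorted then maps every element of `keys`
    # to the index of its unique value, which bincount can count directly.
    uniq_keys = np.unique(keys)
    bins = uniq_keys.searchsorted(keys)
    return uniq_keys, np.bincount(bins)

x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])
print(count_unique(x))
# (array([ 1,  2,  5, 25]), array([5, 3, 1, 1]))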
Using pandas module (the counts come back as a pandas Series, hence the dtype: int64 at the end of its printout):
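A minimal sketch, assuming pandas.Series.value_counts is the intended call:

import pandas as pd

x = [1, 1, 1, 2, 2, 2, 5, 25, 1, 1]

# value_counts returns an int64 Series mapping each value to its count,
# sorted by count in descending order: 1 -> 5, 2 -> 3, 5 -> 1, 25 -> 1.
print(pd.Series(x).value_counts())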
Even though it has already been answered, I suggest a different approach that makes use of numpy.histogram. Given a sequence, that function returns the frequency of its elements grouped in bins. Beware though: it works in this example because the numbers are integers; if they were real numbers, this solution would not apply as nicely.
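A minimal sketch of this approach, using one unit-wide bin per integer between min(x) and max(x), which is why the integer caveat above matters:

import numpy as np

x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])

# One unit-wide bin for every integer value between min(x) and max(x).
bins = np.arange(x.min(), x.max() + 2)
counts, _ = np.histogram(x, bins=bins)

# Drop empty bins and pair each remaining value with its count.
print({int(v): int(c) for v, c in zip(bins[:-1], counts) if c > 0})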
This gives you: {1: 5, 2: 3, 5: 1, 25: 1}