Calculating frequency distribution of a collection

Posted 2019-07-17 15:33

Is there a fast/simple way to calculate the frequency distribution of a .NET collection using LINQ or otherwise?

For example: An arbitrarily long List contains many repetitions. What's a clever way of walking the list and counting/tracking repetitions?

Tags: .net collections frequency-distribution

3 answers

ゆ、 Hurt°

#2 · 2019-07-17 16:09

The easiest way is to use a hash map. Either use each item as the key and increment its count every time the item appears, or pick a bucket size (bucket 1 = 1 - 10, bucket 2 = 11 - 20, etc.) and increment the count of the bucket each value falls into.

Then you can go through and determine the frequencies.
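
A minimal sketch of both variants in C#, assuming a plain Dictionary serves as the hash map. The names FrequencySketch, CountItems and CountBuckets are made up for illustration, and the bucket formula assumes integer values starting at 1, as in the bucket example above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; the names are illustrative, not from any library.
public static class FrequencySketch
{
    // Counts how many times each distinct item occurs.
    public static Dictionary<T, int> CountItems<T>(IEnumerable<T> items)
    {
        var counts = new Dictionary<T, int>();
        foreach (var item in items)
        {
            // TryGetValue leaves current at 0 when the key is not present yet.
            counts.TryGetValue(item, out var current);
            counts[item] = current + 1;
        }
        return counts;
    }

    // Counts values per fixed-width bucket. With bucketSize = 10,
    // bucket 1 covers 1-10, bucket 2 covers 11-20, and so on.
    public static Dictionary<int, int> CountBuckets(IEnumerable<int> values, int bucketSize)
    {
        var counts = new Dictionary<int, int>();
        foreach (var value in values)
        {
            // Assumption: value >= 1, so integer division gives the bucket index.
            int bucket = (value - 1) / bucketSize + 1;
            counts.TryGetValue(bucket, out var current);
            counts[bucket] = current + 1;
        }
        return counts;
    }
}
```

Walking the returned dictionary then gives the frequency of each item or bucket. Both methods make a single pass over the input.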

趁早两清

#3 · 2019-07-17 16:15

The C5 generic collections library has a HashBag implementation that accepts duplicates by counting. The following pseudo-code would get you what you're looking for:

var hash = new HashBag<K>();
hash.AddAll(list);
var mults = hash.ItemMultiplicities();

Here K is the type of the items in your list. mults will then contain an IDictionary<K,int> in which each list item is the key and its multiplicity is the value.

霸刀☆藐视天下

#4 · 2019-07-17 16:24

The simplest way to find duplicate items in a list is to group it, like this:

var dups = list.GroupBy(i => i).Where(g => g.Skip(1).Any());

(Writing Skip(1).Any() should be faster than Count() > 1 because it won't have to traverse more than two items from each group. However, the difference is probably negligible unless the list's enumerator is slow.)
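
The same grouping can also produce the full frequency distribution the question asks for, not only the duplicates. A minimal sketch, assuming the items are non-null (ToDictionary rejects null keys); GroupingSketch and Frequencies are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class; the names are illustrative, not from any library.
public static class GroupingSketch
{
    // Maps each distinct item to the number of times it occurs in the list.
    // Assumption: no item is null, because ToDictionary throws on a null key.
    public static Dictionary<T, int> Frequencies<T>(IEnumerable<T> list)
    {
        return list.GroupBy(i => i)
                   .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

For example, GroupingSketch.Frequencies(new[] { 1, 2, 2, 3, 3, 3 }) maps 1 to 1, 2 to 2 and 3 to 3.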
