I am porting some C++ code to Python and one of the data structures is a multiset, but I am not sure how to model this in Python.
Let ms
be the C++ multiset<int>
How ms
is used (posting some examples)
multiset<int>::iterator it = ms.find(x)
ms.erase(it)
ms.insert(x)
ms.end()
ms.lower_bound(x)
ms.clear()
You can keep a list ordered using the bisect functions. For example
find
will becomeYou will find other equivalents in the docs. Instead of checking against
end
you will now get aValueError
There are a couple of data-structures which come close.
python collections:
provided by django framework:
There isn't. See Python's standard library - is there a module for balanced binary tree? for a general discussion of the equivalents of C++ tree containers (
map
,set
,multimap
,multiset
) in Python.The closest I can think of is to use a dictionary mapping integers to counts (also integers). However this doesn't get you the keys in order, so you can't search using
lower_bound
. An alternative is a to use an ordered list, as suggested by others already, maybe a list of (integer, count) tuples? If you only need to search after you've done all your insertions, you could use the dictionary as a temporary structure for construction, build the list after you've done all the insertions, then use the list for searching.There are a couple implementations of sorted list data types which would fit your criteria. Two popular choices are SortedContainers and blist modules. Each of these modules provides a SortedList data type which automatically maintains the elements in sorted order and would allow for fast insertion and lower/upper bound lookups. There's a performance comparison that's helpful too.
The equivalent code using the SortedList type from the SortedContainers module would be:
All of those operations should work efficiently on a sorted list data type.