So I have an std::set which needs to keep specific ordering as well as not allowing duplicates of a user defined (by me) type. Now I can get the order to work correctly by overloading the '<' operator in my type. However, the set does not appropriately detect duplicates, and to be honest I'm not entirely sure how it does this internally. I have overloaded the '==' operator, but somehow im not sure this is what the set is actually using? So the question is how does the set determine duplicates when you add values? Here is the relevant code:
The user defined type:
//! An element used in the route calculation.
struct RouteElem {
int shortestToHere; // Shortest distance from the start.
int heuristic; // The heuristic estimate to the goal.
Coordinate position;
bool operator<( const RouteElem& other ) const
{
return (heuristic+shortestToHere) < (other.heuristic+other.shortestToHere);
}
bool operator==( const RouteElem& other ) const
{
return (position.x == other.position.x && position.y == other.position.y);
}
};
So the elements are equivalent when their position is equivalent, and an element is less than another if its combined functional is less than that of the other. The sorting works, but the set will accept two elements of the same position.
operator==
is not used bystd::set
. Elementsa
andb
are considered equal iff!(a < b) && !(b < a)
The STL set implementation does something conceptually like this to detect equality:
That is, if two elements are both not less than the other, then they must be equal. You may be able to check this by setting a breakpoint on your
operator==()
method and checking to see whether it ever gets called at all.I would generally be suspicious of comparison operators that check completely different things. Your
<
operator is defined in terms of two things that are separate from how your==
operator is defined. Generally you will want such comparisons to use consistent information.You could try something like the following:
Obviously there's the copy in the middle, which looks slow. But any structure which indexes items by two different criteria is therefore going to have some kind of extra overhead per item compared with a set. The whole of the code above is O(n log n), assuming that your implementation of std::sort uses introsort.
If you have it, under this scheme you could use
unordered_set
instead ofset
to do the initial uniqueifying. Since the hash would only have to depend on x and y, it should be faster than the O(log N) comparisons required to insert into a set.Edit: just noticed that you said you wanted to "keep" sort order, not that you wanted to process everything in a batch. Sorry about that. If you want to efficiently maintain order and exclude duplicates while adding elements, then I would recommend using the set or unordered set I define above, based on position, and also a
std::multiset<RouteElem>
, which will maintain theoperator<
order. For each new element, do:Although beware that this offers no exception guarantee. If the second insert throws, then the routeset has been modified, so the state is no longer consistent. So I guess really you need:
Or an equivalent with RAII, which will be more verbose if there's only one place in your code you ever use the RAII class, but better if there's much repetition.
std::set
supports specifying a comparison function. The default isless
which will useoperator <
to check equality. You can define a custom function to check equality and use that one instead:Note that it's not possible to separate the comparison function from sorting function.
std::set
is a binary tree and if an element in a binary tree is not bigger nor smaller than a specific element, it should be in the same place. It does something like this in the place finding algorithm:rlbond's comparator does not prevent the insertion of elements which compare equal. Apparently it's difficult to prove this in comments, given the character limit, because rlbond appears to thinks that std::set guarantees that it will never contain two elements with
!compare(a,b) && !compare(b,a)
for his comparator. However, rlbond's comparator does not define a strict order, and therefore is not a valid parameter to std::set.Output:
Duplicates. The magic comparator has failed. Different elements in the set have the same value of
equality
, and hence compare the same withoperator==
, because during insertion the set never compared the new element against its duplicate. The only duplicate which was excluded was 4, because the two 4's had sort orders 4 and 6. This put them close enough together in the set to be compared against each other.From the C++ standard: 25.3:3 "For the algorithms to work correctly, comp has to induce a strict weak ordering on the values".
25.3:4 "... the requirements are that comp and equiv both be transitive relations:
Now, consider the elements
a = BrokenOrder(1,1)
,b = BrokenOrder(2,2)
, andc = BrokenOrder(9,1)
, andcomp
of course equal to the magic comparator. Then:comp(a,b)
is true since 1 != 2 (equality) and 1 < 2 (order)comp(b,c)
is true since 2 != 1 (equality) and 2 < 9 (order)comp(a,c)
is false since 1 == 1 (equality)Beware of the ramifications of this. It looks like you are trying to do something like A*, and if you try to insert a "duplicate" it will be ignored, even if there is a "better" route.
NOTE: This solution doesn't work, see onebyone's explanation below