可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I created a simple bidirectional map class that works by internally storing two std::map
instances, with opposite key/value types, and providing an user-friendly interface:
template<class T1, class T2> class Bimap
{
std::map<T1, T2> map1;
std::map<T2, T1> map2;
// ...
};
EDIT:
Should bimap element be mutable or immutable? (Changing one element in map1
should change the key in map2
, but keys are const and that's impossible - what's the solution?)
Ownership of elements is also another problem: when an user inserts a key-value pair in the bimap, the bimap should make a copy of that key-value pair and store it, then the internal second map (with inverted key/value) should not copy but point to the original pair. How can this be achieved?
EDIT 2:
I've posted a possible implementation I made on Code Review.
回答1:
There is a certain problem with double-storing your data in all simple implementations of a bimap. If you can break it down to a bimap of pointers from outside, then you can readily ignore this and simply keep both maps of the form std::map<A*,B*>
like Arkaitz Jimenez already suggested (though contrary to his answer you have to care about the storage from outside to avoid a A->A*
lookup). But if you have the pointers anyway, why not simply store a std::pair<A,B>
at the point where you would otherwise store A
and B
separately?
It would be nice to have std::map<A,B*>
instead of std::map<A*,B*>
as this would allow for example the lookup of an element associated to an string by a newly created string with the same content instead of the pointer to the original string that created the pair. But it is customary to store a full copy of the key with every entry and only rely on the hash to find the right bucket. This way the returned item will be the correct one even in the case of a hash-collision...
If you want to have it quick and dirty though, there is this
hackish solution:
Create two maps std::map<size_t, A> mapA
and std::map<size_t, B> mapB
. Upon insertion hash both elements that are to be inserted to get the keys to the respective maps.
void insert(const A &a, const B &b) {
size_t hashA = std::hash<A>(a);
size_t hashB = std::hash<B>(b);
mapA.insert({hashB, a});
mapB.insert({hashA, b});
}
Lookup is implemented analogously.
Using a multimap
instead of a map
and verifying every element you get with a lookup in the respectivly other map (get candidate b
from mapA
, hash b
and look in mapB
if it matches the wanted key, iterate to the next candidate b otherwise) this is a valid implementation - but still hackish in my opinion...
You can get a much nicer solution by using the copies of the elements that are used to compare the entries (see above) as only storage. It is a bit harder to get your head around that though. To elaborate:
a nicer solution:
Create two sets of pairs as std::set<pair<A, B*>>
and std::set<pair<B, A*>>
and overload the operator<
and operator==
to only take the first element of the pairs into account (or provide an corresponding comparion class). It is necessary to create sets of pairs instead of maps (which internally look similarly) because we need a guarantee that A
and B
will always be in the same positions in memory. Upon insertion of an pair<A,B>
we split it into two elements that fit into the above sets.
std::set<pair<B, A*>> mapA;
std::set<pair<A, B*>> mapB;
void insert(const A &a, const B &b) {
auto aitr = mapA.insert({b, nullptr}).first; // creates first pair
B *bp = &(aitr->first); // get pointer of our stored copy of b
auto bitr = mapB.insert({a, bp}).first;
// insert second pair {a, pointer_to_b}
A *ap = &(bitr->first); // update pointer in mapA to point to a
aitr->second = ap;
}
Lookup can now simply be done by a simple std::set
lookup and a pointer dereference.
This nicer solution is similar to the solution that boost uses - even though they use some annonymized pointers as second elements of the pairs and thus have to use reinterpret_cast
s.
Note that the .second
part of the pairs need to be mutable (so I'm not sure std::pair
can be used), or you have to add another layer of abstraction (std::set<pair<B, A**>> mapA
) even for this simple insertion. In both solutions you need temporary elements to return non-const references to elements.
回答2:
It would be more efficient to store all elements in a vector and have 2 maps of <T1*,T2*>
and <T2*,T1*>
that way you would not have everything copied twice.
The way I see it you are trying to store 2 things, elements themselves and the relationship between them, if you are aiming to scalar types you could leave it as is 2 maps, but if you aim to treat complex types it makes more sense to separate the storage from the relationships, and handle relationships outside the storage.
回答3:
Boost Bimap makes use of Boost Mutant Idiom.
From the linked wikipedia page:
Boost mutant idiom makes use of reinterpret_cast and depends heavily on assumption that the memory layouts of two different structures with identical data members (types and order) are interchangeable. Although the C++ standard does not guarantee this property, virtually all the compilers satisfy it.
template <class Pair>
struct Reverse
{
typedef typename Pair::first_type second_type;
typedef typename Pair::second_type first_type;
second_type second;
first_type first;
};
template <class Pair>
Reverse<Pair> & mutate(Pair & p)
{
return reinterpret_cast<Reverse<Pair> &>(p);
}
int main(void)
{
std::pair<double, int> p(1.34, 5);
std::cout << "p.first = " << p.first << ", p.second = " << p.second << std::endl;
std::cout << "mutate(p).first = " << mutate(p).first << ", mutate(p).second = " << mutate(p).second << std::endl;
}
The implementation in boost sources is of course fairly hairier.
回答4:
If you create a set of pairs to your types std::set<std::pair<X,Y>>
you pretty much have your functionallity implemented and rules about mutabillity and constness preset (OK maybe the settings aren't what you want but tweaks can be made). So here is the code :
#ifndef MYBIMAP_HPP
#define MYBIMAP_HPP
#include <set>
#include <utility>
#include <algorithm>
using std::make_pair;
template<typename X, typename Y, typename Xless = std::less<X>,
typename Yless = std::less<Y>>
class bimap
{
typedef std::pair<X, Y> key_type;
typedef std::pair<X, Y> value_type;
typedef typename std::set<key_type>::iterator iterator;
typedef typename std::set<key_type>::const_iterator const_iterator;
struct Xcomp
{
bool operator()(X const &x1, X const &x2)
{
return !Xless()(x1, x2) && !Xless()(x2, x1);
}
};
struct Ycomp
{
bool operator()(Y const &y1, Y const &y2)
{
return !Yless()(y1, y2) && !Yless()(y2, y1);
}
};
struct Fless
{ // prevents lexicographical comparison for std::pair, so that
// every .first value is unique as if it was in its own map
bool operator()(key_type const &lhs, key_type const &rhs)
{
return Xless()(lhs.first, rhs.first);
}
};
/// key and value type are interchangeable
std::set<std::pair<X, Y>, Fless> _data;
public:
std::pair<iterator, bool> insert(X const &x, Y const &y)
{
auto it = find_right(y);
if (it == end()) { // every .second value is unique
return _data.insert(make_pair(x, y));
}
return make_pair(it, false);
}
iterator find_left(X const &val)
{
return _data.find(make_pair(val,Y()));
}
iterator find_right(Y const &val)
{
return std::find_if(_data.begin(), _data.end(),
[&val](key_type const &kt)
{
return Ycomp()(kt.second, val);
});
}
iterator end() { return _data.end(); }
iterator begin() { return _data.begin(); }
};
#endif
Example usage
template<typename X, typename Y, typename In>
void PrintBimapInsertion(X const &x, Y const &y, In const &in)
{
if (in.second) {
std::cout << "Inserted element ("
<< in.first->first << ", " << in.first->second << ")\n";
}
else {
std::cout << "Could not insert (" << x << ", " << y
<< ") because (" << in.first->first << ", "
<< in.first->second << ") already exists\n";
}
}
int _tmain(int argc, _TCHAR* argv[])
{
bimap<std::string, int> mb;
PrintBimapInsertion("A", 1, mb.insert("A", 1) );
PrintBimapInsertion("A", 2, mb.insert("A", 2) );
PrintBimapInsertion("b", 2, mb.insert("b", 2));
PrintBimapInsertion("z", 2, mb.insert("z", 2));
auto it1 = mb.find_left("A");
if (it1 != mb.end()) {
std::cout << std::endl << it1->first << ", "
<< it1->second << std::endl;
}
auto it2 = mb.find_right(2);
if (it2 != mb.end()) {
std::cout << std::endl << it2->first << ", "
<< it2->second << std::endl;
}
return 0;
}
NOTE: All this is a rough code sketching of what a full implementation would be and even after polishing and extending the code I'm not implying that this would be an alternative to boost::bimap
but merely a homemade way of having an associative container searchable by both the value and the key.
Live example
回答5:
One possible implementation that uses an intermediate data structure and an indirection is:
int sz; // total elements in the bimap
std::map<A, int> mapA;
std::map<B, int> mapB;
typedef typename std::map<A, int>::iterator iterA;
typedef typename std::map<B, int>::iterator iterB;
std::vector<pair<iterA, iterB>> register;
// All the operations on bimap are indirected through it.
Insertion
Suppose you have to insert (X, Y) where X, Y are instances of A and B respectively, then:
- Insert (X, sz) in
mapA
--- O(lg sz)
- Insert (Y, sz) in
mapB
--- O(lg sz)
- Then
push_back
(IterX, IterY) in register
--- O(1). Here IterX and IterY are iterators to the corresponding element in mapA
and mapB
and are obtained from (1) and (2) respectively.
Lookup
Lookup for the image of an element, X, of type A:
- Get the int mapped to X in
mapA
. --- O(lg n)
- Use the int to index into
register
and get corresponding IterY. --- O(1)
- Once you have IterY, you can get Y through
IterY->first
. --- O(1)
So both the operations are implemented in O(lg n).
Space: All the copies of the objects of A and B are required to be stored only once. There is, however, a lot of bookkeeping stuff. But when you have large objects, that would also be not much significant.
Note: This implementation relies on the fact that the iterators of a map are never invalidated. Hence, the contents of register
are always valid.
A more elaborate version of this implementation can be found here
回答6:
How about this?
Here, we avoid double storage of one type (T1). The other type (T2) is still double stored.
// Assume T1 is relatively heavier (e.g. string) than T2 (e.g. int family).
// If not, client should instantiate this the other way.
template <typename T1, typename T2>
class UnorderedBimap {
typedef std::unordered_map<T1, T2> Map1;
Map1 map1_;
std::unordered_map<T2, Map1::iterator> map2_;
};