Union iterator for maps?

2020-06-16 04:28发布

问题:

[Preface: The associative C++ containers like std::map are a bit like micro-databases with just one key column. Boost's bimap elevates this to a two-column table with lookup in both columns, but that that's as far as the analogy goes -- there's no "polymap" that generalizes the idea.]

In any event, I want to keep thinking of maps as databases, and I now wonder if there is an iterator (or some other solution) that allows me to do a UNION of several constituent maps. That is, all maps have the same type (or value type and comparator, at least), and I want a single iterator that treats the entire collection as a big multimap (repeated keys are OK) and lets me traverse it in the correct unioned order.

Does such a thing exist, perhaps within Boost? Or is it easy to rig one up? In pseudo code:

std::map<K, M> m1, m2;
union_iterator<K, M> u(m1, m2)
for(auto it = u.begin(); it != u.end(); ++it) { /* ... */ }

For example, if we had:

m1 = { { 9:00, "Check in"}, { 12:00, "Break" }, { 16:00, "Check out"} };
m2 = { { 10:30, "coffee" }, { 12:15, "baked beans" }, { 15:00, "lies" } };

then I want the iterator to produce:

9:00, "Check in"; 10:30, "coffee"; 12:00, "Break"; 12:15, "baked beans"; ...

回答1:

As I announced, I have got something pretty cool.

I'm posting it now, because I wouldn't be sure whether I'd be back in time tonight to post it. I will be spending a few words in explanation. (in this post)

PS. The includes will be trimmed down (to about 20%); I will probably do some more general work on the code too.

A lot can be said about this code: it is not very efficient, and not very clean (yet). It is, however, nearly infinitely generic and should scale like anything else. All code can be found in a github gist:

  • merge_maps_iterator.hpp
  • Makefile
  • test.cpp - a rather arcane set of test-cases showing off the genericity
    (I'm not saying that it would be a good idea to have maps keyed with ints and floats (let alone both at the same time) - just showing that it can be done)

Here is the output of the test.cpp as you can find it:

 == input ========================================
{ 2, aap }      { 23, mies }    { 100, noot }   { 101, broer }  
{ b, 3.14 }     
 == output =======================================
     2: aap;
    23: mies;
    98: 3.14;
   100: noot;
   101: broer;

 == input ========================================
{ b, 3.14 }     
{ 2, aap }      { 23, mies }    { 100, noot }   { 101, broer }  
 == output =======================================
     2: aap;
    23: mies;
    98: 3.14;
   100: noot;
   101: broer;

 == input ========================================
{ 2, aap }      { 23, mies }    { 100, noot }   { 101, broer }  
{ 2, aap }      { 23, mies }    { 100, noot }   { 101, broer }  
 == output =======================================
     2: aap;aap;
    23: mies;mies;
   100: noot;noot;
   101: broer;broer;

 == input ========================================
{ b, 3.14 }     
{ b, 3.14 }     
 == output =======================================
     b: 3.14;3.14;

 == input ========================================
{ 1.0, dag }    { 22.0, bye }   { 24.0, Tschüß }
{ 1, true }     { 22, false }   { 24, true }    
{ b, 3.14 }     
{ 2, aap }      { 23, mies }    { 100, noot }   { 101, broer }  
 == output =======================================
   1.0: dag;true;
   2.0: aap;
  22.0: bye;false;
  23.0: mies;
  24.0: Tschüß;true;
  98.0: 3.14;
 100.0: noot;
 101.0: broer;

 == input ========================================
{ 1.0, dag }    { 2.0, EXTRA }  { 22.0, bye }   { 24.0, Tschüß }
{ 1, true }     { 22, false }   { 24, true }    
{ b, 3.14 }     
{ 2, aap }      { 23, mies }    { 100, noot }   { 101, broer }  
 == output =======================================
   1.0: dag;true;
   2.0: EXTRA;aap;
  22.0: bye;false;
  23.0: mies;
  24.0: Tschüß;true;
  98.0: 3.14;
 100.0: noot;
 101.0: broer;


回答2:

There is a "polymap": Boost.MultiIndex.



回答3:

Either copying both mapS into a temporary, appending one to the other (in case you can modify them) or using a vector as a temporary with std::set_union and a custom comparator are the easiest alternative solutions.



回答4:

Here's how I would implement thiton's answer:

template <class container> class union_iterator
{
private:
    typedef std::pair<typename container::const_iterator, typename container::const_iterator> container_range;
    class container_range_compare
    {
    public:
        bool operator()(const container_range &lhs, const container_range &rhs) const
        {
            return typename container::value_compare()(*lhs.first, *rhs.first);
        }
    };

    std::priority_queue<container_range, container_range_compare> m_range_queue;
    container::const_iterator m_current_iterator;
    bool m_is_valid;

    void add_container(const container &cont)
    {
        add_container_range(std::make_pair(cont.begin(), cont.end()));
    }

    void add_container_range(const container_range &range)
    {
        if (range.first!=range.second)
        {
            m_range_queue.push(range);
        }
    }

public:
    union_iterator(const container &a): m_valid(false)
    {
        add_container(a);
    }

    bool next()
    {
        m_is_valid= false;

        if (!m_range_queue.empty())
        {
            container_range range= m_range_queue.pop();
            m_current_iterator= range.first;

            ++range.first;
            add_container_range(range);

            m_is_valid= true;
        }

        return m_is_valid;
    }

    typename const container::value_type &operator *() const
    {
        return *m_current_iterator;
    }

    typename const container::value_type *operator ->() const
    {
        return m_current_iterator.operator ->();
    }
};

It has slightly different usage than union_iterator<K, V> but it implements the basic idea. You can expand the constructor to accept multiple maps however you fit, and use it in a while (iterator.next()) loop instead of a for (...) loop.

EDIT: I simplified next() by doing all the popping and pushing at once. So now it's even simpler! (One could also expend some effort making it like a STL iterator, but that gets tedious.)



回答5:

Very simple solution using boost function_output_iterator:

typedef std::map< std::string, std::string > Map;
Map first_map, second_map;
... // fill maps
// iterate over maps union
std::merge(
            first_map.begin(), first_map.end(),
            second_map.begin(), second_map.end(),
            boost::make_function_output_iterator(
                []( const Map::value_type & pair )
                {
                    std::cout << 
                    "key = " << pair.first << 
                    "; value = " << pair.second << std::endl;
                }       
            ),
            first_map.value_comp()
    );

We can make this solution prettier by using boost::set_union (range version) instead of std::set_union.

UPD Updated version use different key/values types:

typedef std::map< int, char > FirstMap;
typedef std::map< short, std::string > SecondMap;
FirstMap        first_map;
SecondMap       second_map;

... // fill maps

struct CustomOutput
{
    void operator()( const FirstMap::value_type & pair ) const
    {
        std::cout << "key = " << pair.first <<
        "; value = " << pair.second << std::endl;
    }

    void operator()( const SecondMap::value_type & pair ) const
    {
        std::cout << "key = " << pair.first <<
        "; value = " << pair.second << std::endl;
    }
};

struct CustomPred
{
    bool operator()( const FirstMap::value_type & first_pair, const SecondMap::value_type & second_pair ) const
    { return first_pair.first < second_pair.first; }

    bool operator()( const SecondMap::value_type & second_pair, const FirstMap::value_type & first_pair ) const
    { return second_pair.first < first_pair.first; }
};

// iterate over maps union
std::merge(
            first_map.begin(), first_map.end(),
            second_map.begin(), second_map.end(),
            boost::make_function_output_iterator( CustomOutput() ),
            CustomPred()
    );

UPD2 std::set_union replaced with std::merge



回答6:

Or is it easy to rig one up?

Rigging up should be fairly easy: For N base maps, your iterator contains a priority queue prioritized by the N keys of the elements the base iterators point to. For dereference, dereference the iterator at the queue front. For increment, increment the iterator at the queue front and, if it's increment is not at the end, re-insert it.



回答7:

Here's how it can be done quite easily:

template<class It>
class union_iterator
{
public:
  union_iterator(It it1_begin, It it1_end, It it2_begin, It it2_end)
     : current1(it1_begin), current2(it2_begin), end1(it1_end), end2(it2_end)
     { if (it1_begin != it1_end && it2_begin != it2_end) {
         if (*it1_begin < *it2_begin) { current= &current1; }
         else { current = &current2; }
       } else if (it1_begin==it1_end) { current=&current2; }
       else { current = &current1; }
     }
  void operator++() { 
    if (current1!=end1 && current2 !=end2) { 
       if (*current1 < *current2) 
         { ++current1; current = &current1; } 
         else { ++current2; current=&current2; } 
    } else if (current1==end1 && current2 != end2) {
       ++current2;
       current = &current2;
    } else if (current1!=end1 && current2 == end2) {
       ++current1;
       current = &current1;
    }
  }
  typename std::iterator<It1>::value_type operator*() { return **current; }
private:
  It current1;
  It current2;
  It end1;
  It end2;
  It *current;
};

But the real problem is implementing all the remaining member functions required by normal iterators :-). Boost has some lib for helping you do it, but it might still be quite difficult.



回答8:

This isn't an iterator like you asked for, but I just found this function in the standard library:

§ 25.4.5.2 set_union [set.union]

 template<class InputIterator1, class InputIterator2,
 class OutputIterator, class Compare>
 OutputIterator
 set_union(InputIterator1 first1, InputIterator1 last1,
 InputIterator2 first2, InputIterator2 last2,
 OutputIterator result, Compare comp);
  1. Effects: Constructs a sorted intersection of the elements from the two ranges; that is, the set of elements that are present in both of the ranges.
  2. Requires: The resulting range shall not overlap with either of the original ranges.
  3. Returns: The end of the constructed range.
  4. Complexity: At most 2 * ((last1 - first1) + (last2 - first2)) - 1 comparisons.
  5. Remarks: If [first1,last1) contains m elements that are equivalent to each other and [first2, last2) contains n elements that are equivalent to them, the first min(m, n) elements shall be copied from the first range to the output range, in order.

There's also a std::set_intersection, std::set_difference, and std::set_symmetric_difference



标签: c++ map iterator