What I want to do: I want to sort 2, or 3, or N vectors, locked together, without copying them into a tuple. That is, leaving verbosity aside, something like:
vector<int> v1 = { 1, 2, 3, 4, 5};
vector<double> v2 = { 11, 22, 33, 44, 55};
vector<long> v3 = {111, 222, 333, 444, 555};
typedef tuple<int&,double&,long&> tup_t;
sort(zip(v1,v2,v3),[](tup_t t1, tup_t t2){ return t1.get<0>() > t2.get<0>(); });
for(auto& t : zip(v1,v2,v3))
cout << t.get<0>() << " " << t.get<1>() << " " << t.get<2>() << endl;
This should output:
5 55 555
4 44 444
...
1 11 111
How I am doing it right now: I have implemented my own quicksort, where the first array I pass is used for the comparison, and the permutations are applied to all other arrays. I just couldn't figure out how to reuse std::sort for my problem (e.g. extract permutations).
What I've tryed: boost::zip_iterator and boost::zip_range (with boost::combine range), but both std::sort and boost::range::algorithm::sort complain that the iterators/ranges are read only and not random access...
Question: How do I sort N vectors in lock step (zipped)? The problem looks pretty generic and common so I guess there must be an easy solution through a probably very complex library but I just can't find it...
Remarks: yes, there are similar questions in stackoverflow, this question gets asked a lot in different forms. However they are always closed with one of the following answers:
- Copy your vectors into a pair/tuple and sort that tuple...
- Copy your vectors into a struct with one member per vector and sort a vector of structs...
- Implement your own sorting function for your particular problem...
- Use an auxiliary array of indices...
- Use boost::zip_iterator without an example or with an example that produces bad results.
Hints:
- I've found this thread in the boost mailing list which points to this paper from Anthony Williams. Although this seems to only work for pairs, they also discus a TupleIteratorType but I haven't been able to find it.
- user673679 found this post containing a nice solution for the two container case. It also nails down the problem (the emphasis is mine):
[...] the fundamental problem is that "pairs" of array references do not behave like they should [...] I simply decided to abuse the notation of an iterator and write something that works. This involved writing, effectively, a non-conforming iterator where the reference of the value type is not the same as the reference type.
Answer: see comment by interjay below (this also partially answers the future question):
#include "tupleit.hh"
#include <vector>
#include <iostream>
#include <boost/range.hpp>
#include <boost/range/algorithm/sort.hpp>
#include <boost/range/algorithm/for_each.hpp>
template <typename... T>
auto zip(T&... containers)
-> boost::iterator_range<decltype(iterators::makeTupleIterator(std::begin(containers)...))> {
return boost::make_iterator_range(iterators::makeTupleIterator(std::begin(containers)...),
iterators::makeTupleIterator(std::end(containers)...));
}
int main() {
typedef boost::tuple<int&,double&,long&> tup_t;
std::vector<int> a = { 1, 2, 3, 4 };
std::vector<double> b = { 11, 22, 33, 44 };
std::vector<long> c = { 111, 222, 333, 444 };
auto print = [](tup_t t){ std::cout << t.get<0>() << " " << t.get<1>() << " " << t.get<2>() << std::endl; };
boost::for_each( zip(a, b, c), print);
boost::sort( zip(a, b, c), [](tup_t i, tup_t j){ return i.get<0>() > j.get<0>(); });
for ( auto tup : zip(a, b, c) ) print(tup);
return 0;
}
Future question: the previous answer works for sequence containers. Could we get it also to work on sortable containers (e.g. sequences and lists)? This would require random_access and bidirectional TupleIterators as well as a sort algorithm that works on bidirectional iterators.
Update: this works for combinations of sequence-like containers. However mixing a list would require that std::sort supported BidirectionalIterators (which does not).
Here's a working example based on the range-v3 Library that has been proposed for Standardization
#include <range/v3/all.hpp>
#include <iostream>
using namespace ranges;
int main()
{
std::vector<int> a1{15, 7, 3, 5};
std::vector<int> a2{ 1, 2, 6, 21};
sort(view::zip(a1, a2), std::less<>{}, &std::pair<int, int>::first);
std::cout << view::all(a1) << '\n';
std::cout << view::all(a2) << '\n';
}
Live Example (requires recent compiler with good C++14 support, not VS 2015).
For the two container case, here's a version that compiles on gcc 4.4.6, based on the paper referred to above. In later versions of gcc, you can swap out boost::tuple for std::tuple
#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>
# include <boost/iterator/iterator_facade.hpp>
# include <boost/tuple/tuple.hpp>
using namespace std;
template <class T, class T2>
struct helper_type {
typedef boost::tuple<typename iterator_traits<T>::value_type, typename iterator_traits<T2>::value_type> value_type;
typedef boost::tuple<typename iterator_traits<T>::value_type&, typename iterator_traits<T2>::value_type&> ref_type;
};
template <typename T1, typename T2>
class dual_iterator : public boost::iterator_facade<dual_iterator<T1, T2>,
typename helper_type<T1, T2>::value_type,
boost::random_access_traversal_tag,
typename helper_type<T1, T2>::ref_type> {
public:
explicit dual_iterator(T1 iter1, T2 iter2) : mIter1(iter1), mIter2(iter2) {}
typedef typename iterator_traits<T1>::difference_type difference_type;
private:
void increment() { ++mIter1; ++mIter2; }
void decrement() { --mIter1; --mIter2; }
bool equal(dual_iterator const& other) const { return mIter1 == other.mIter1; }
typename helper_type<T1, T2>::ref_type dereference() const { return (typename helper_type<T1, T2>::ref_type(*mIter1, *mIter2)); }
difference_type distance_to(dual_iterator const& other) const { return other.mIter1 - mIter1; }
void advance(difference_type n) { mIter1 += n; mIter2 += n; }
T1 mIter1;
T2 mIter2;
friend class boost::iterator_core_access;
};
template <typename T1, typename T2>
dual_iterator<T1, T2> make_iter(T1 t1, T2 t2) { return dual_iterator<T1, T2>(t1, t2); }
template <class T1, class T2> struct iter_comp {
typedef typename helper_type<T1, T2>::value_type T;
bool operator()(const T& t1, const T& t2) { return get<0>(t1) < get<0>(t2); }
};
template <class T1, class T2> iter_comp<T1, T2> make_comp(T1 t1, T2 t2) { return iter_comp<T1, T2>(); }
template<class T> void print(T& items) {
copy(items.begin(), items.end(), ostream_iterator<typename T::value_type>(cout, " ")); cout << endl;
}
int main() {
vector<double> nums1 = {3, 2, 1, 0};
vector<char> nums2 = {'D','C', 'B', 'A'};
sort(make_iter(nums1.begin(), nums2.begin()),
make_iter(nums1.end(), nums2.end()),
make_comp(nums1.begin(), nums2.begin()));
print(nums1);
print(nums2);
}
Create an auxillary array that contains the indices 0..N-1. Sort that array with a custom comparator that actually returns results from comparing elements of one of your primary arrays. Then use the sorted auxillary array to print out your primary arrays in the right order.
Nice to meet a fellow Internet Archaeologist !
How do I sort N vectors in lock step (zipped)? The problem looks
pretty generic and common so I guess there must be an easy solution
through a probably very complex library but I just can't find it.
Sometimes ago, I went on the same treasure hunt with similar assumptions...
Never found the treasure :(
I followed the same tracks as you :
- Go through the usual suspects boost.iterator/boost.range/boost.fusion/boost.oven and after quite a lot of experimentation and research realize that they can't solve this particular problem.
- Go through many question on SO, only to realize that every single one have been closed either with incorrect answer (for example recommending boost::zip_iterator which doesn't work in this case as you point out), or with some workaround that avoid the heart of the matter.
- Go through many blog post, mailing list, only to realize that nobody actually solved the problem except....
- After much research finally dig up the old codex by Antonius Wilhelm, who claim to have crafted a general solution "TupleIterator" and locked it inside some archive "tupleit.zip". Historical sources are so scarce on this one, that I'm still not sure if this archive is a myth, a legend, or if it still buried somewhere in a lost layer of the internet :)
Ok, more seriously, the paper by Anthony Williams suggest that the problem is actually really hard, so it's not that surprising to find that no existing library like boost solved it.
I'm pleased to say that I've found a solution, after a similar treasure hunt. range-v3 is a great idea if you can use it, but if you really need an iterator, the HPX project has created one and it works perfectly with sort.
Owing to an oversight, which will hopefully be repaired, it still requires you to link with the HPX library but that's OK for me because the whole point was to use C++17 parallel algorithms, which HPX provides an implementation of.
#include <hpx/util/zip_iterator.hpp>
using zip_it =
hpx::util::zip_iterator<std::vector<std::size_t>::iterator,
std::vector<std::size_t>::iterator,
std::vector<double>::iterator>;
int main() {
std::vector<std::size_t> rows{3, 2, 1};
std::vector<std::size_t> cols{1, 2, 3};
std::vector<double> values{4.0, 5.0, 6.0};
auto start = hpx::util::make_zip_iterator(rows.begin(), cols.begin(), values.begin());
auto stop = hpx::util::make_zip_iterator(rows.end(), cols.end(), values.end());
std::sort(start, stop);
for ( int i = 0; i < 3; ++i ) {
std::cerr << rows[i] << ", " << cols[i] << ", " << values[i] << "\n";
}
}