Applying multiple tuples to the same function (i.e

2020-08-18 05:58发布

问题:

std::experimental::apply has the following signature:

template <class F, class Tuple>
constexpr decltype(auto) apply(F&& f, Tuple&& t);

It basically invokes f by expanding t's elements as the arguments.


I would like something that does the exact same thing, but with multiple tuples at the same time:

template <class F, class... Tuples>
constexpr decltype(auto) multi_apply(F&& f, Tuples&&... ts);

Example usage:

std::tuple t0{1, 2, 3};
std::tuple t1{4, 5, 6};
auto sum = [](auto... xs){ return (0 + ... + xs); };

assert(multi_apply(sum, t0, t1) == 1 + 2 + 3 + 4 + 5 + 6);

I can think of various naive way of implementing multi_apply:

  • Use std::tuple_cat and then calling std::experimental::apply.

  • Use recursion to bind every tuple's arguments to a series of lambdas that eventually call the original function.

But what I'm asking is: how can I implement multi_apply without resorting to std::tuple_cat or recursion?

What I ideally would like to do is: generate an std::index_sequence for every tuple and match every tuple with its own index sequence in the same variadic expansion. Is this possible? Example:

// pseudocode-ish
template <class F, std::size_t... Idxs, class... Tuples>
constexpr decltype(auto) multi_apply_helper(
    F&& f, std::index_sequence<Idxs>... seqs,  Tuples&&... ts)
{
    return f(std::get<Idxs>(ts)...);
} 

回答1:

Here's my take on it. It doesn't use recursion and it expands those tuples in the same pack expansion, but it requires a bit of preparation:

  • We build a tuple of references to the tuples passed in, rvalue references for rvalue arguments, lvalue references for lvalue arguments, in order to have proper forwarding in the final call (exactly what std::forward_as_tuple does, as T.C. noted in the comments). The tuple is built and passed around as an rvalue, so reference collapsing ensures correct value categories for each argument in the final call to f.
  • We build two flattened index sequences, both of size equal to the sum of all tuple sizes:
    • The outer indices select the tuple, so they repeat the same value (the tuple's index in the tuple pack) a number of times equal to the size of each tuple.
    • The inner ones select the element in each tuple, so they increase from 0 to one less than the tuple size for each tuple.

Once we have that in place, we just expand both index sequences in the call to f.

#include <tuple>
#include <array>
#include <cstddef>
#include <utility>
#include <type_traits>
#include <iostream>

template<std::size_t S, class... Ts> constexpr auto make_indices()
{
   constexpr std::size_t sizes[] = {std::tuple_size_v<std::remove_reference_t<Ts>>...};
   using arr_t = std::array<std::size_t, S>;
   std::pair<arr_t, arr_t> ret{};
   for(std::size_t c = 0, i = 0; i < sizeof...(Ts); ++i)
      for(std::size_t j = 0; j < sizes[i]; ++j, ++c)
      {
         ret.first[c] = i;
         ret.second[c] = j;
      }
   return ret;
}

template<class F, class... Tuples, std::size_t... OuterIs, std::size_t... InnerIs> 
constexpr decltype(auto) multi_apply_imp_2(std::index_sequence<OuterIs...>, std::index_sequence<InnerIs...>, 
                                           F&& f, std::tuple<Tuples...>&& t)
{
   return std::forward<F>(f)(std::get<InnerIs>(std::get<OuterIs>(std::move(t)))...);
}

template<class F, class... Tuples, std::size_t... Is> 
constexpr decltype(auto) multi_apply_imp_1(std::index_sequence<Is...>, 
                                           F&& f, std::tuple<Tuples...>&& t)
{
   constexpr auto indices = make_indices<sizeof...(Is), Tuples...>();
   return multi_apply_imp_2(std::index_sequence<indices.first[Is]...>{}, std::index_sequence<indices.second[Is]...>{},
      std::forward<F>(f), std::move(t));
}

template<class F, class... Tuples> 
constexpr decltype(auto) multi_apply(F&& f, Tuples&&... ts)
{
   constexpr std::size_t flat_s = (0U + ... + std::tuple_size_v<std::remove_reference_t<Tuples>>);
   if constexpr(flat_s != 0)
      return multi_apply_imp_1(std::make_index_sequence<flat_s>{}, 
         std::forward<F>(f), std::forward_as_tuple(std::forward<Tuples>(ts)...));
   else
      return std::forward<F>(f)();
}

int main()
{
   auto t0 = std::make_tuple(1, 2);
   auto t1 = std::make_tuple(3, 6, 4, 5);
   auto sum = [](auto... xs) { return (0 + ... + xs); };

   std::cout << multi_apply(sum, t0, t1, std::make_tuple(7)) << '\n';
}

It compiles on the trunk versions of Clang and GCC in C++1z mode. In terms of generated code, GCC with -O2 optimizes the call to multi_apply to a constant 28.


Replacing std::array with a built-in array inside make_indices by using arr_t = std::size_t[S]; makes it compile on Clang 3.9.1 (that version of libc++ lacks constexpr on std::array's operator[]).

Further replacing std::tuple_size_v with std::tuple_size<X>::value and removing the if constexpr test in multi_apply makes it compile on GCC 6.3.0. (The test handles the cases when no tuples are passed in or all tuples passed in are empty.)

Further replacing the uses of fold expressions with calls like

sum_array({std::tuple_size_v<std::remove_reference_t<Tuples>>...})

where sum_array can be something simple like

template<class T, std::size_t S> constexpr T sum_array(const T (& a)[S], std::size_t i = 0)
{
   return i < S ? a[i] + sum_array(a, i + 1) : 0;
}

makes it compile on the latest MSVC 2017 RC (MSVC actually has std::tuple_size_v, but it needs the other changes). The generated code is still great: after replacing the body of the sum lambda with sum_array({xs...}), the resulting code is a direct call to sum_array with the array built in-place directly from the elements of all tuples, so the multi_apply machinery doesn't introduce any run time overhead.


std::apply is defined in terms of INVOKE, so, to keep things consistent, the final call to f should be

std::invoke(std::forward<F>(f), std::get<InnerIs>(std::get<OuterIs>(std::move(t)))...)

Implementations may provide a noexcept-specifier on std::apply (at least, libc++ does; libstdc++ and MSVC currently don't) so that may be worth considering too.



回答2:

Alternative version:

template <class F, std::size_t... Is, class ... Ts>
constexpr decltype(auto) multiple_apply_impl(F&& f, std::index_sequence<Is...>, Ts&&... ts)
{
    constexpr auto p = [](){
        constexpr auto total_size = sizeof...(Is);
        std::array<std::size_t, total_size> outer{};
        std::array<std::size_t, total_size> inner{};
        std::size_t global_index = 0;
        std::size_t outer_value = 0;

        [[maybe_unused]] auto l = [&](std::size_t size)
        {
            for (std::size_t i = 0; i != size; ++i) {
                outer[global_index] = outer_value;
                inner[global_index] = i;
                ++global_index;
            }
            ++outer_value;
        };
        (l(std::tuple_size<std::decay_t<Ts>>::value), ...);

        return make_pair(outer, inner);
    }();
    [[maybe_unused]] constexpr auto outer = p.first;
    [[maybe_unused]] constexpr auto inner = p.second;

    using std::get;
    return std::invoke(std::forward<F>(f),
                       get<inner[Is]>(get<outer[Is]>(std::forward_as_tuple(std::forward<Ts>(ts)...)))...);
}

template <class F, class ... Ts>
constexpr decltype(auto) multiple_apply(F&& f, Ts&&... ts)
{
    constexpr auto total_size = (std::size_t{0} + ... + std::tuple_size<std::decay_t<Ts>>::value);

    return multiple_apply_impl(std::forward<F>(f),
                               std::make_index_sequence<total_size>(),
                               std::forward<Ts>(ts)...);
}

Demo