std::experimental::apply
has the following signature:
template <class F, class Tuple>
constexpr decltype(auto) apply(F&& f, Tuple&& t);
It basically invokes f
by expanding t
's elements as the arguments.
I would like something that does the exact same thing, but with multiple tuples at the same time:
template <class F, class... Tuples>
constexpr decltype(auto) multi_apply(F&& f, Tuples&&... ts);
Example usage:
std::tuple t0{1, 2, 3};
std::tuple t1{4, 5, 6};
auto sum = [](auto... xs){ return (0 + ... + xs); };
assert(multi_apply(sum, t0, t1) == 1 + 2 + 3 + 4 + 5 + 6);
I can think of various naive way of implementing multi_apply
:
But what I'm asking is: how can I implement multi_apply
without resorting to std::tuple_cat
or recursion?
What I ideally would like to do is: generate an std::index_sequence
for every tuple and match every tuple with its own index sequence in the same variadic expansion. Is this possible? Example:
// pseudocode-ish
template <class F, std::size_t... Idxs, class... Tuples>
constexpr decltype(auto) multi_apply_helper(
F&& f, std::index_sequence<Idxs>... seqs, Tuples&&... ts)
{
return f(std::get<Idxs>(ts)...);
}
Here's my take on it. It doesn't use recursion and it expands those tuples in the same pack expansion, but it requires a bit of preparation:
- We build a tuple of references to the tuples passed in, rvalue references for rvalue arguments, lvalue references for lvalue arguments, in order to have proper forwarding in the final call (exactly what
std::forward_as_tuple
does, as T.C. noted in the comments). The tuple is built and passed around as an rvalue, so reference collapsing ensures correct value categories for each argument in the final call to f
.
- We build two flattened index sequences, both of size equal to the sum of all tuple sizes:
- The outer indices select the tuple, so they repeat the same value (the tuple's index in the tuple pack) a number of times equal to the size of each tuple.
- The inner ones select the element in each tuple, so they increase from
0
to one less than the tuple size for each tuple.
Once we have that in place, we just expand both index sequences in the call to f
.
#include <tuple>
#include <array>
#include <cstddef>
#include <utility>
#include <type_traits>
#include <iostream>
template<std::size_t S, class... Ts> constexpr auto make_indices()
{
constexpr std::size_t sizes[] = {std::tuple_size_v<std::remove_reference_t<Ts>>...};
using arr_t = std::array<std::size_t, S>;
std::pair<arr_t, arr_t> ret{};
for(std::size_t c = 0, i = 0; i < sizeof...(Ts); ++i)
for(std::size_t j = 0; j < sizes[i]; ++j, ++c)
{
ret.first[c] = i;
ret.second[c] = j;
}
return ret;
}
template<class F, class... Tuples, std::size_t... OuterIs, std::size_t... InnerIs>
constexpr decltype(auto) multi_apply_imp_2(std::index_sequence<OuterIs...>, std::index_sequence<InnerIs...>,
F&& f, std::tuple<Tuples...>&& t)
{
return std::forward<F>(f)(std::get<InnerIs>(std::get<OuterIs>(std::move(t)))...);
}
template<class F, class... Tuples, std::size_t... Is>
constexpr decltype(auto) multi_apply_imp_1(std::index_sequence<Is...>,
F&& f, std::tuple<Tuples...>&& t)
{
constexpr auto indices = make_indices<sizeof...(Is), Tuples...>();
return multi_apply_imp_2(std::index_sequence<indices.first[Is]...>{}, std::index_sequence<indices.second[Is]...>{},
std::forward<F>(f), std::move(t));
}
template<class F, class... Tuples>
constexpr decltype(auto) multi_apply(F&& f, Tuples&&... ts)
{
constexpr std::size_t flat_s = (0U + ... + std::tuple_size_v<std::remove_reference_t<Tuples>>);
if constexpr(flat_s != 0)
return multi_apply_imp_1(std::make_index_sequence<flat_s>{},
std::forward<F>(f), std::forward_as_tuple(std::forward<Tuples>(ts)...));
else
return std::forward<F>(f)();
}
int main()
{
auto t0 = std::make_tuple(1, 2);
auto t1 = std::make_tuple(3, 6, 4, 5);
auto sum = [](auto... xs) { return (0 + ... + xs); };
std::cout << multi_apply(sum, t0, t1, std::make_tuple(7)) << '\n';
}
It compiles on the trunk versions of Clang and GCC in C++1z mode. In terms of generated code, GCC with -O2
optimizes the call to multi_apply
to a constant 28
.
Replacing std::array
with a built-in array inside make_indices
by using arr_t = std::size_t[S];
makes it compile on Clang 3.9.1 (that version of libc++ lacks constexpr
on std::array
's operator[]
).
Further replacing std::tuple_size_v
with std::tuple_size<X>::value
and removing the if constexpr
test in multi_apply
makes it compile on GCC 6.3.0. (The test handles the cases when no tuples are passed in or all tuples passed in are empty.)
Further replacing the uses of fold expressions with calls like
sum_array({std::tuple_size_v<std::remove_reference_t<Tuples>>...})
where sum_array
can be something simple like
template<class T, std::size_t S> constexpr T sum_array(const T (& a)[S], std::size_t i = 0)
{
return i < S ? a[i] + sum_array(a, i + 1) : 0;
}
makes it compile on the latest MSVC 2017 RC (MSVC actually has std::tuple_size_v
, but it needs the other changes). The generated code is still great: after replacing the body of the sum
lambda with sum_array({xs...})
, the resulting code is a direct call to sum_array
with the array built in-place directly from the elements of all tuples, so the multi_apply
machinery doesn't introduce any run time overhead.
std::apply
is defined in terms of INVOKE, so, to keep things consistent, the final call to f
should be
std::invoke(std::forward<F>(f), std::get<InnerIs>(std::get<OuterIs>(std::move(t)))...)
Implementations may provide a noexcept-specifier on std::apply
(at least, libc++ does; libstdc++ and MSVC currently don't) so that may be worth considering too.
Alternative version:
template <class F, std::size_t... Is, class ... Ts>
constexpr decltype(auto) multiple_apply_impl(F&& f, std::index_sequence<Is...>, Ts&&... ts)
{
constexpr auto p = [](){
constexpr auto total_size = sizeof...(Is);
std::array<std::size_t, total_size> outer{};
std::array<std::size_t, total_size> inner{};
std::size_t global_index = 0;
std::size_t outer_value = 0;
[[maybe_unused]] auto l = [&](std::size_t size)
{
for (std::size_t i = 0; i != size; ++i) {
outer[global_index] = outer_value;
inner[global_index] = i;
++global_index;
}
++outer_value;
};
(l(std::tuple_size<std::decay_t<Ts>>::value), ...);
return make_pair(outer, inner);
}();
[[maybe_unused]] constexpr auto outer = p.first;
[[maybe_unused]] constexpr auto inner = p.second;
using std::get;
return std::invoke(std::forward<F>(f),
get<inner[Is]>(get<outer[Is]>(std::forward_as_tuple(std::forward<Ts>(ts)...)))...);
}
template <class F, class ... Ts>
constexpr decltype(auto) multiple_apply(F&& f, Ts&&... ts)
{
constexpr auto total_size = (std::size_t{0} + ... + std::tuple_size<std::decay_t<Ts>>::value);
return multiple_apply_impl(std::forward<F>(f),
std::make_index_sequence<total_size>(),
std::forward<Ts>(ts)...);
}
Demo