I am trying to solve a problem using CUDA Thrust.
I have a host array with 3 elements. Is it possible, using Thrust, to create a device array of 384 elements in which the 3 elements of my host array are repeated 128 times (128 x 3 = 384)?
Generally speaking, starting from an array of 3 elements, how can I use Thrust to generate a device array of size X, where X = Y x 3, i.e. Y is the number of repetitions?
Robert Crovella has already answered this question using strided ranges. He has also pointed out the possibility of using the expand operator.
Below, I'm providing a worked example using the expand operator. Unlike the strided-range approach, it avoids the need for for loops.
As an apparently simpler alternative to using CUDA Thrust, I'm posting below a worked example that implements Matlab's classical meshgrid function in CUDA.
In Matlab, calling meshgrid on a three-element array x and a four-element array y produces two matrices, X and Y.
X is exactly the four-fold replication of the x array, which is the OP's question and the first guess of Robert Crovella's answer, while Y is the three-fold consecutive replication of each element of the y array, which is the second guess of Robert Crovella's answer.
Here is the code:
One possible approach:
This code is a trivial modification of the strided range example to demonstrate the idea. You can change the REPS define to 128 to see the full expansion to 384 output elements:
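As a loop-free sketch of the same repetition (a different formulation, not the strided-range listing itself), the output index i can be mapped back to source index i % 3 and the elements fetched with thrust::gather. The names modulo_index and tile_on_device, the int element type and the sample values are assumptions made for illustration; REPS reuses the name mentioned above and is set to 128 here.

```cuda
#include <cassert>
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/gather.h>
#include <thrust/host_vector.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/transform_iterator.h>

#define REPS 128  // number of repetitions; 128 x 3 = 384 output elements

// Hypothetical functor: maps an output position to its source position.
struct modulo_index
{
    typedef int result_type;
    int n;
    explicit modulo_index(int n_) : n(n_) {}
    __host__ __device__ int operator()(int i) const { return i % n; }
};

// Hypothetical helper: copies h_src to the device and tiles it reps times.
thrust::device_vector<int> tile_on_device(const thrust::host_vector<int>& h_src, int reps)
{
    thrust::device_vector<int> d_src = h_src;  // host-to-device copy
    const int n = static_cast<int>(d_src.size());
    const int total = n * reps;
    thrust::device_vector<int> d_out(total);

    // Lazily generated index sequence 0, 1, ..., n-1, 0, 1, ..., n-1, ...
    thrust::transform_iterator<modulo_index, thrust::counting_iterator<int> >
        map_first(thrust::make_counting_iterator(0), modulo_index(n));

    // d_out[i] = d_src[i % n] for every i in [0, total)
    thrust::gather(map_first, map_first + total, d_src.begin(), d_out.begin());
    return d_out;
}

int main()
{
    // Sample values are assumptions chosen only for illustration.
    thrust::host_vector<int> h_src(3);
    h_src[0] = 10; h_src[1] = 20; h_src[2] = 30;

    thrust::device_vector<int> d_out = tile_on_device(h_src, REPS);
    thrust::host_vector<int> h_out = d_out;  // device-to-host copy
    assert(h_out.size() == 3 * REPS);

    for (size_t i = 0; i < h_out.size(); ++i) printf("%d ", h_out[i]);
    printf("\n");
    return 0;
}
```

Because the index sequence is produced on the fly by the transform iterator, no index array is stored in device memory, and the whole repetition is a single Thrust call regardless of the value of REPS.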