What is the optimal way to use additional data fields in functors in Thrust?

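What is the correct (or optimal) way to use constant data in functors passed to Thrust algorithms such as thrust::transform? The naive approach I have used so far is simply to allocate the required arrays inside the functor's operator() method, like this: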

struct my_functor {

    __host__ __device__
    float operator()(thrust::tuple<float, float> args) {

        float A[2][10] = {
            { 4.0, 1.0, 8.0, 6.0, 3.0, 2.0, 5.0, 8.0, 6.0, 7.0 },
            { 4.0, 1.0, 8.0, 6.0, 7.0, 9.0, 5.0, 1.0, 2.0, 3.6 }};

        float x1 = thrust::get<0>(args);
        float x2 = thrust::get<1>(args);

        float result = 0.0;
        for (int i = 0; i < 10; ++i)
            result += x1 * A[0][i] + x2 * A[1][i];

        return result;
    }
};

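But this does not seem very elegant or efficient. I now have to develop a relatively complicated functor that uses several matrices (constant, as in the example above) and additional methods called from the functor's operator() method. What is the best way to solve this problem? Thanks.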

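From your last comment, it is clear that what you are really asking about here is functor parameter initialisation. CUDA uses the C++ object model, so structs have class semantics and behaviour. So your example functor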

struct my_functor {
    __host__ __device__
    float operator()(thrust::tuple<float, float> args) const {
        float A[2] = {50., 55.6};

        float x1 = thrust::get<0>(args);
        float x2 = thrust::get<1>(args);

        return x1 * A[0]+ x2 * A[1];
    }
};

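can be rewritten with an empty constructor that uses an initialisation list, turning the hardcoded constants inside the functor into values that are assigned at runtime: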

struct my_functor {
    float A0, A1;

    __host__ __device__
    my_functor(float _a0, float _a1) : A0(_a0), A1(_a1) { }

    __host__ __device__
    float operator()(thrust::tuple<float, float> args) const {
        float x1 = thrust::get<0>(args);
        float x2 = thrust::get<1>(args);

        return x1 * A0 + x2 * A1;
    }
};

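You can instantiate as many versions of the functor as you like, each with different constant values, to do whatever it is you are using the functors for in conjunction with the Thrust library.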
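
The same idea can be extended to the 2x10 matrix from the question by storing the array as a data member and filling it in the constructor. What follows is only a minimal sketch under a few assumptions, not a definitive implementation: the names matrix_functor, HD and apply_on_host are hypothetical, and operator() takes two floats rather than a thrust::tuple so that the block also compiles without the CUDA toolkit. The helper uses std::transform on the host; with Thrust, the same functor object could instead be passed to the two-input overload of thrust::transform.

```cpp
#include <algorithm>
#include <vector>

// Expands to the CUDA qualifiers under nvcc and to nothing under a
// host-only compiler (HD is a hypothetical helper macro).
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical functor: the constant matrix lives in a data member
// instead of being rebuilt inside operator() on every call.
struct matrix_functor {
    float A[2][10];

    // Copy the caller's matrix into the functor once, at construction time.
    HD explicit matrix_functor(const float (&src)[2][10]) {
        for (int r = 0; r < 2; ++r)
            for (int c = 0; c < 10; ++c)
                A[r][c] = src[r][c];
    }

    // Same arithmetic as the question's functor, written as a binary
    // functor (an assumption made so that no thrust::tuple is needed).
    HD float operator()(float x1, float x2) const {
        float result = 0.0f;
        for (int i = 0; i < 10; ++i)
            result += x1 * A[0][i] + x2 * A[1][i];
        return result;
    }
};

// Host-only stand-in for the transform call; applies f element-wise
// to the pairs (x1[i], x2[i]). Both inputs must have the same length.
std::vector<float> apply_on_host(const matrix_functor& f,
                                 const std::vector<float>& x1,
                                 const std::vector<float>& x2) {
    std::vector<float> out(x1.size());
    std::transform(x1.begin(), x1.end(), x2.begin(), out.begin(), f);
    return out;
}
```

Because the functor is copied by value when it is handed to an algorithm, the whole array travels with it. That is reasonable for a small matrix like this one; for large tables, a pointer to device memory held as a member would probably be the better choice.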