Using pointers in C++ AMP

Posted 2019-08-05 09:59

Question:

I have the following issue:

I have code that performs a very basic operation. I am passing a pointer to a concurrency::array_view because I want to store the values ahead of time and avoid a bottleneck in the function that does the parallel work. The problem is that the following construct won't compile:

parallel_for_each((*pixels).extent, [=](concurrency::index<2> idx) restrict(amp)
{
    int row = idx[0];
    int col = idx[1];

    (*pixels)(row, col) = (*pixels)(row, col) * (*taps)(row, col); // this is the problematic line
});

Does anybody know how to solve this? I really need to prepare the data before running the method, so this is the only way I can do it: I cannot afford to spend time copying data between RAM and the accelerator's memory.

//EDIT:

After solving some issues with the header files, I am left with the following problem:

parallel_for_each((*pixels).extent, [=](concurrency::index<2> idx) restrict(amp)
{
    int row = idx[0];
    int col = idx[1];
});

The code above doesn't work (it throws an exception). Is there any way to prepare the data earlier so that, for example, the class constructor can handle the copying just once? I really need to have a pointer to an array_view in my header file and initialize it in the constructor as follows:

in cci_subset.h:

concurrency::array_view<float, 2> *pixels, *taps; 

and in the subset.cpp:

concurrency::array_view<float, 2> pixels(4, 4, pixel_array); 
... 
concurrency::array_view<float, 2> taps(4, 4, myTap4Kernel_array); 

//EDIT 2:

I found out that parameters can only be passed to parallel_for_each by value. That's why I am still looking for a way to copy the values from the CPU to the GPU when initializing the class or when passing arguments (i.e. image data) to the class.

Answer 1:

Your C++ AMP issue

C++ AMP supports two core data types for referencing data on the GPU: array and array_view.

An array represents data on an accelerator. You can construct it and fill it with data in a single step or construct it and fill it with data later. In either case, after some calculations have been performed on it, you will almost certainly copy the results from an array back to the CPU so that you can use them in some other part of your application.

You can certainly write useful applications using only arrays, but C++ AMP also offers the array_view, which supports features that often make it more convenient than working directly with arrays. An array_view looks like an array to the accelerator, but it saves you the trouble of arranging to copy the data to and from the accelerator.

The relationship between an array_view and an array is somewhat (but not precisely) like that between a reference and the object it refers to. Like a reference, array views must be initialized when they are created. Also as with a reference, changing the array_view changes (eventually) the data it was created from. However, the reverse is not true: changing the data from which the array_view was created might not automatically change the array_view, so you should approach such operations with care.

From: C++ AMP: Accelerated Massive Parallelism

I don't think your use of pixels is the problem per se. You cannot use globally scoped variables within a C++ AMP lambda, period. There is no way around this. The C++ AMP code is executing on a device with a different memory space.

You can, however, initialize your array or array_view objects earlier in a separate method or constructor and then pass them to the function that does all the work. The following code does something along these lines. m_frame is a smart pointer to a (C++ AMP) array object that is declared as part of the class and then initialized in ConfigureFrameBuffers.

Note that it uses STL smart pointers, something I would strongly recommend over raw pointers.

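#include <amp.h>
#include <memory>

class FrameProcessorAmpBase
{
private:
    // Accelerator-side buffer, allocated once and reused by later kernels.
    std::shared_ptr<concurrency::array<float, 2>> m_frame;

public:
    FrameProcessorAmpBase()
    {
    }

    void ConfigureFrameBuffers(int width, int height)
    {
        // The extent is (rows, columns), hence height before width.
        m_frame = std::make_shared<concurrency::array<float, 2>>(height, width);
    }
};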

Your min/max header issue

This is probably because you are including windef.h, or something that takes a dependency on it. This is a known issue when mixing STL and Windows headers. The way to "fix" it is to define NOMINMAX at the top of your file before any other includes and then use the min/max functions declared by STL or AMP (which also defines min/max for use in restrict(amp) lambdas).

#define NOMINMAX
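Where NOMINMAX cannot be defined for a whole translation unit, another common workaround (not part of the original answer) is to put parentheses around the function name at the call site. The sketch below reproduces the Windows-style macros by hand so that it builds without any Windows headers; the helper clamp_to_byte is hypothetical.

```cpp
#include <algorithm>

// Stand-ins for the function-like macros that windef.h defines when
// NOMINMAX is not set (assumption: written out here so that the sketch
// does not depend on Windows headers).
#define min(a, b) (((a) < (b)) ? (a) : (b))
#define max(a, b) (((a) > (b)) ? (a) : (b))

// Hypothetical helper: clamps a value to the range 0..255.
// In (std::min)(...) the name is followed by ')' rather than '(', so the
// preprocessor does not expand the macro and the STL template is called.
inline int clamp_to_byte(int value)
{
    return (std::max)(0, (std::min)(value, 255));
}

// Remove the stand-in macros again so they do not leak into other code.
#undef min
#undef max
```

For example, clamp_to_byte(300) yields 255 and clamp_to_byte(-5) yields 0, even though the macros are active where the function is defined.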

If you are using GDI+ you'll have issues there too, as it requires the Windows min/max macros.

I wrap GdiPlus.h in a wrapper header that contains the following:

#pragma once

#define NOMINMAX
#include <algorithm>
#ifndef max
    #define max std::max
#endif
#ifndef min
    #define min std::min
#endif
#include <gdiplus.h>
#undef max 
#undef min