Assume that I am wait()ing for the kernel to compute the work.
I was wondering whether, when allocating a buffer with the CL_MEM_USE_HOST_PTR flag, it is necessary to call enqueueReadBuffer()/enqueueWriteBuffer() on the buffer, or whether these calls can always be omitted.
Note
I am aware of this note in the reference:
Calling clEnqueueReadBuffer to read a region of the buffer object with the ptr argument value set to host_ptr + offset, where host_ptr is a pointer to the memory region specified when the buffer object being read is created with CL_MEM_USE_HOST_PTR, must meet the following requirements in order to avoid undefined behavior:
- All commands that use this buffer object have finished execution before the read command begins execution
- The buffer object is not mapped
- The buffer object is not used by any command-queue until the read command has finished execution
So, to clarify my question, I will split it in two:
- If I create a buffer using the CL_MEM_USE_HOST_PTR flag, can I assume the OpenCL implementation will write to the device cache when necessary, so that I can always skip enqueueWriteBuffer()?
- If I call event.wait() after launching a kernel, can I always skip enqueueReadBuffer() when accessing computed data in a buffer created with the CL_MEM_USE_HOST_PTR flag?
Maybe I am overthinking this, but even though the description of the flag is clear that host memory will be used to store the data, it is not clear (or I did not find where it is clarified) when the data becomes available and whether the read/write is always implicit.