I am wondering whether it is possible to cap the GPU resources available to a CUDA application. For example, on a 4 GB GPU, I would like a given application to be able to access only 2 GB of memory, and to fail if it tries to allocate more.
Ideally, this limit could be set either at the process level or at the CUDA context level.