Adding support for a memory type similar to __shar

2019-07-16 20:23发布

问题:

I am working towards adding a new memory type similar to __shared__ in CUDA called __noc__ which needs to be compiled using clang-llvm. Following are the steps followed to achieve the parsing for the new memory type taking reference from the answer:

Step 1: In the clangs's Attr.td file (clang/include/clang/Basic/Attr.td), the noc keyword was added similar to shared keyword.

def CUDAShared : InheritableAttr {
  let Spellings = [GNU<"shared">];
  let Subjects = SubjectList<[Var]>;
  let LangOpts = [CUDA];
  let Documentation = [Undocumented];
}

def CUDANoc : InheritableAttr {
  let Spellings = [Keyword<"noc">];
  let Subjects = SubjectList<[Var]>;
  let LangOpts = [CUDA];
  let Documentation = [Undocumented];
}

Step 2: Similar to CUDASharedAttr, CUDANocAttr was added in clang/lib/Sema/SemaDeclAttr.cpp.

  case AttributeList::AT_CUDAShared:
    handleSimpleAttribute<CUDASharedAttr>(S, D, Attr);
    break;
  case AttributeList::AT_CUDANoc:
    handleSimpleAttribute<CUDANocAttr>(S, D, Attr);
    printf("\n T1:SemaDeclAttr.cpp"); //testpoint 1 : for noc debugging
    break;

Step 3: In the SemaDecl.cpp file, the CUDANocAttr is added to enforce noc to be a static storage (similar to shared)

  if (getLangOpts().CUDA) {
    if (EmitTLSUnsupportedError && DeclAttrsMatchCUDAMode(getLangOpts(), NewVD))
      Diag(D.getDeclSpec().getThreadStorageClassSpecLoc(),
           diag::err_thread_unsupported);
    // CUDA B.2.5: "__shared__ and __constant__ variables have implied static
    // storage [duration]."
    if (SC == SC_None && S->getFnParent() != nullptr &&
        (NewVD->hasAttr<CUDASharedAttr>() ||
         NewVD->hasAttr<CUDANocAttr>()||
         NewVD->hasAttr<CUDAConstantAttr>())) {
      NewVD->setStorageClass(SC_Static);
    }
  }

Step 4: NOC is added in CodeGenModule (llvm/tools/clang/lib/CodeGen/CodeGenModule.cpp) to allow the access of cuda_noc address space from NVPTXAddrSpaceMap

    else if (D->hasAttr<CUDASharedAttr>())
      AddrSpace = getContext().getTargetAddressSpace(LangAS::cuda_shared);
    else if (D->hasAttr<CUDANocAttr>())
      AddrSpace = getContext().getTargetAddressSpace(LangAS::cuda_noc);
    else
      AddrSpace = getContext().getTargetAddressSpace(LangAS::cuda_device);
  }

  return AddrSpace;
}

Step 5: The cuda_noc is added to NVPTXAddrSpaceMap array to allow a new type of address space

static const unsigned NVPTXAddrSpaceMap[] = {
    1, // opencl_global
    3, // opencl_local
    4, // opencl_constant
    // FIXME: generic has to be added to the target
    0, // opencl_generic
    1, // cuda_device
    4, // cuda_constant
    3, // cuda_shared
    6, // cuda_noc
};

Step 6: The macro #define __noc__ __location__(noc) is added to the file clang/lib/Headers/__clang_cuda_runtime_wrapper.h where host_defines.h from CUDA is included.

The llvm source code got compiled and installed successfully. But when trying to compile a CUDA source file with the a memory type __noc__, it give the following warning:

warning: unknown attribute 'noc' ignored [-Wunknown-attributes]
        __noc__ int c_shared[5];
        ^

/usr/local/bin/../lib/clang/3.8.0/include/__clang_cuda_runtime_wrapper.h:69:30: note: expanded from macro '__noc__'
#define __noc__ __location__(noc)
                             ^
1 warning generated.

From the warnings it can be observed that __noc__ is ignored. In the IR generated, addrspace(6) which is corresponding to __noc__ is not observed.

From the debug printf put into the file clang/lib/Sema/SemaDeclAttr.cpp (step 2), it can be observed that the case for AttributeList::AT_CUDANoc is not getting executed.

Any suggestions or intuitions can greatly help. Is there any script to be run explicitly before compiling the llvm source code for the inputs in the *.td file to appear as C++ source code...

回答1:

__location__(noc) gets expanded to __attribute__((noc)). This is the GNU or gcc attribute syntax. So the issue is with this line:

let Spellings = [Keyword<"noc">];

In order for noc to work with __location__ macro, you should use GNU<"noc"> instead of Keyword<"noc">.