I am working towards adding a new memory type similar to __shared__
in CUDA called __noc__
which needs to be compiled using clang-llvm. Following are the steps followed to achieve the parsing for the new memory type taking reference from the answer:
Step 1: In the clangs's Attr.td file (clang/include/clang/Basic/Attr.td), the noc keyword was added similar to shared keyword.
def CUDAShared : InheritableAttr {
let Spellings = [GNU<"shared">];
let Subjects = SubjectList<[Var]>;
let LangOpts = [CUDA];
let Documentation = [Undocumented];
}
def CUDANoc : InheritableAttr {
let Spellings = [Keyword<"noc">];
let Subjects = SubjectList<[Var]>;
let LangOpts = [CUDA];
let Documentation = [Undocumented];
}
Step 2: Similar to CUDASharedAttr
, CUDANocAttr
was added in clang/lib/Sema/SemaDeclAttr.cpp.
case AttributeList::AT_CUDAShared:
handleSimpleAttribute<CUDASharedAttr>(S, D, Attr);
break;
case AttributeList::AT_CUDANoc:
handleSimpleAttribute<CUDANocAttr>(S, D, Attr);
printf("\n T1:SemaDeclAttr.cpp"); //testpoint 1 : for noc debugging
break;
Step 3: In the SemaDecl.cpp file, the CUDANocAttr
is added to enforce noc to be a static storage (similar to shared)
if (getLangOpts().CUDA) {
if (EmitTLSUnsupportedError && DeclAttrsMatchCUDAMode(getLangOpts(), NewVD))
Diag(D.getDeclSpec().getThreadStorageClassSpecLoc(),
diag::err_thread_unsupported);
// CUDA B.2.5: "__shared__ and __constant__ variables have implied static
// storage [duration]."
if (SC == SC_None && S->getFnParent() != nullptr &&
(NewVD->hasAttr<CUDASharedAttr>() ||
NewVD->hasAttr<CUDANocAttr>()||
NewVD->hasAttr<CUDAConstantAttr>())) {
NewVD->setStorageClass(SC_Static);
}
}
Step 4: NOC is added in CodeGenModule (llvm/tools/clang/lib/CodeGen/CodeGenModule.cpp) to allow the access of cuda_noc
address space from NVPTXAddrSpaceMap
else if (D->hasAttr<CUDASharedAttr>())
AddrSpace = getContext().getTargetAddressSpace(LangAS::cuda_shared);
else if (D->hasAttr<CUDANocAttr>())
AddrSpace = getContext().getTargetAddressSpace(LangAS::cuda_noc);
else
AddrSpace = getContext().getTargetAddressSpace(LangAS::cuda_device);
}
return AddrSpace;
}
Step 5: The cuda_noc
is added to NVPTXAddrSpaceMap
array to allow a new type of address space
static const unsigned NVPTXAddrSpaceMap[] = {
1, // opencl_global
3, // opencl_local
4, // opencl_constant
// FIXME: generic has to be added to the target
0, // opencl_generic
1, // cuda_device
4, // cuda_constant
3, // cuda_shared
6, // cuda_noc
};
Step 6: The macro #define __noc__ __location__(noc)
is added to the file clang/lib/Headers/__clang_cuda_runtime_wrapper.h where host_defines.h from CUDA is included.
The llvm source code got compiled and installed successfully. But when trying to compile a CUDA source file with the a memory type __noc__, it give the following warning:
warning: unknown attribute 'noc' ignored [-Wunknown-attributes]
__noc__ int c_shared[5];
^
/usr/local/bin/../lib/clang/3.8.0/include/__clang_cuda_runtime_wrapper.h:69:30: note: expanded from macro '__noc__'
#define __noc__ __location__(noc)
^
1 warning generated.
From the warnings it can be observed that __noc__
is ignored. In the IR generated, addrspace(6)
which is corresponding to __noc__
is not observed.
From the debug printf put into the file clang/lib/Sema/SemaDeclAttr.cpp (step 2), it can be observed that the case for AttributeList::AT_CUDANoc
is not getting executed.
Any suggestions or intuitions can greatly help. Is there any script to be run explicitly before compiling the llvm source code for the inputs in the *.td file to appear as C++ source code...