Just to be sure: Is LLVM bitcode cross-platform? By which I mean, can the generated IR (".bc") file be distrubuted and interpreted/JITed over various platforms?
If so, how does Clang convert C++ into platform independend code? While in the C++ language itself, preprocessors for determining it's target platform are used before it actually compiles.
LLVM IR can be cross-platform, with the obvious exceptions others have listed. However, that does not mean Clang generates cross-platform code. As you note, the preprocessor is almost universally used to only pass parts of the code to the C/C++ compiler, depending on the platform. Even when this is not done in user code, many system headers include a bit or two that's platform-specific, such as typedef
s. For example, if you compile C code using size_t
to LLVM IR on a platform where size_t
is 32 bit, the LLVM IR now uses i32
for that, and there's no way in hell you can reverse engineer that to fix it.
Google's Portable Native Client project (thanks @willglynn for the link), if I understand it correctly, achieves portability by fixing the ABI for all target platforms. So in that sense, it doesn't solve the aforementioned issues: The LLVM IR is not portable to platform with a different ABI. The only reason this is more portable is that the clients provide a layer which matches the PNaCl ABI to the actual ABI. In other words, PNaCl code isn't portable to many platforms, the "PNaCl VM" is.
So, bottom line: If you're very careful, you can use LLVM IR across multiple platforms, but not without doing significant additional work (which Clang doesn't do) to abstract over the ABI differences.
Given an IR file, can I be sure it could compile to my target?
You can not assume an arbitrary IR file will always be cross-platform, as there are things in a given file that might not be platform-independent. The most notable example is that the IR can contain actual assembler sequences (via module-level or inline assembly segments), but there are other examples - e.g. usage of target specific intrinsics or calling conventions that are only supported on some targets.
Can I generate an IR file that is guaranteed to compile on all targets?
I don't know, but I believe you can, especially if you avoid specifying things like inline assembly, calling conventions, required / preferred ABI for types, etc. It can affect the optimizations the compiler will perform, though.