Context: I have several loops in an Objective-C library I am writing which deal with processing large text arrays. I can see that right now it is running in a single threaded manner.
I understand that LLVM is now capable of auto-vectorising loops, as described at Apple's session at WWDC. It is however very cautious in the way it does it, one reason being the possibility of variables being modified due to CPU pipelining.
My question: how can I see where LLVM has vectorised my code, and, more usefully, how can I receive debug messages that explain why it can't vectorise my code? I'm sure if it can see why it can't auto-vectorise it, it could point that out to me and I could make the necessary manual adjustments to make it vectorisable.
I would be remiss if I didn't point out that this question has been more or less asked already, but quite obtusely, here.
The standard llvm toolchain provided by Xcode doesn't seem to support getting debug info from the optimizer. However, if you roll your own llvm and use that, you should be able to pass flags as mishr suggested above. Here's the workflow I used:
1. Using homebrew, install llvm
brew tap homebrew/versions
brew install llvm33 --with-clang --with-asan
This should install the full and relatively current llvm toolchain. It's linked into /usr/local/bin/*-3.3
(i.e. clang++-3.3
). The actual on-disk location is available via brew info llvm33
- probably /usr/local/Cellar/llvm33/3.3/bin
.
2. Build the single file you're optimizing, with homebrew llvm and flags
If you've built in Xcode, you can easily copy-paste the build parameters, and use your clang++-3.3 instead of Xcode’s own clang.
Appending -mllvm -debug-only=loop-vectorize
will get you the auto-vectorization report. Note: this will likely NOT work with any remotely complex build, e.g. if you've got PCH's, but is a simple way to tweak a single cpp file to make sure it's vectorizing correctly.
3. Create a compiler plugin from the new llvm
I was able to build my entire project with homebrew llvm by:
- Grabbing this Xcode compiler plugin: http://trac.seqan.de/browser/trunk/util/xcode/Clang%20LLVM%20MacPorts.xcplugin.zip?order=name
- Modifying the clang-related paths to point to my homebrew llvm and clang bin names (by appending '-3.3')
- Placing it in
/Library/Application Support/Developer/5.0/Xcode/Plug-ins/
Relaunching Xcode should show this plugin in the list of available compilers. At this point, the -mllvm -debug-only=loop-vectorize
flag will show the auto-vectorization report.
I have no idea why this isn't exposed in the Apple builds.
UPDATE: This is exposed in current (8.x) versions of Xcode. The only thing required is to enable one or more of the loop-vectorize
flags.
Identifies loops that were successfully vectorized
:
clang -Rpass=loop-vectorize
Identifies loops that failed vectorization and indicates if vectorization was specified:
clang -Rpass-missed=loop-vectorize
Identifies the statements that caused vectorization to fail:
clang -Rpass-analysis=loop-vectorize
Source: http://llvm.org/docs/Vectorizers.html#diagnostics
Assuming you are using opt
and you have a debug build of llvm, you can do it as follows:
opt -O1 -loop-vectorize -debug-only=loop-vectorize code.ll
where code.ll
is the IR you want to vectorize.
If you are using clang
, you will need to pass the -debug-only=loop-vectorize
flag using -mllvm
option.