With VMs being slave to whatever the host machine is providing, what compiler flags should be provided to gcc?
I would normally think that -march=native would be what you would use when compiling for a dedicated box, but the level of fine detail that -march=native goes into, as indicated in this article, makes me extremely wary of using it.
So... what should -march and -mtune be set to inside a VM?
For a specific example...
My specific case right now is compiling Python (and more) in a Linux guest inside a KVM-based "cloud" host where I have no real control over the host hardware (aside from 'simple' stuff like CPU GHz, CPU count, and available RAM). Currently, cpuinfo tells me I've got an "AMD Opteron(tm) Processor 6176", but I honestly don't know (yet) if that is reliable, or whether the guest can get moved around to different architectures on me to meet the host's infrastructure shuffling needs (sounds hairy/unlikely).
All I can really guarantee is my OS, which is a 64-bit Linux kernel where uname -m yields x86_64.
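To make the "is cpuinfo reliable?" question a bit more concrete, here is a minimal Python sketch that pulls the model name and the feature flags out of /proc/cpuinfo so they can be saved and compared later (for example after the guest has been restarted or moved). The function name parse_cpuinfo is made up for illustration, and the sketch assumes the usual x86 Linux layout with "model name" and "flags" fields.

```python
def parse_cpuinfo(text):
    """Return (model_name, set_of_flags) for the first processor entry.

    Assumes the x86 Linux /proc/cpuinfo layout ("key : value" lines with
    "model name" and "flags" keys); other architectures use other keys.
    """
    model = None
    flags = set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between processors
        key = key.strip()
        if key == "model name" and model is None:
            model = value.strip()
        elif key == "flags" and not flags:
            flags = set(value.split())
    return model, flags


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        model, flags = parse_cpuinfo(f.read())
    print("model:", model)
    print("flags:", " ".join(sorted(flags)))
```

If the printed flags ever change between runs on the same guest, that would be a strong hint that binaries built with -march=native on it are not safe to keep around.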
Some incomplete and out of order excerpts from section 3.17.14 Intel 386 and AMD x86-64 Options of the GCC 4.6.3 manual (which I hope are pertinent).
What I found most interesting is that specifying -march=cpu-type implies -mtune=cpu-type. My take on the rest was that if you are specifying both -march and -mtune you're probably getting too close to tweak overkill.

My suggestion would be to just use -m64 and you should be safe enough since you're running inside an x86-64 Linux, correct?

But if you don't need to run in another environment and you're feeling lucky and fault tolerant, then -march=native might also work just fine for you.

For what it's worth ...
Out of curiosity I tried using the technique described in the article you referenced. I tested gcc v4.6.3 in 64-bit Ubuntu 12.04 which was running as a VMware Player guest. The VMware VM was running in Windows 7 on a desktop using an Intel Pentium Dual-Core E6500 CPU.
The gcc option -m64 was replaced with just -march=x86-64 -mtune=generic.

However, compiling with -march=native resulted in gcc using all of the much more specific compiler options below.

So, yes, as the gcc documentation states: "Using -march=native ... the result might not run on different machines". To play it safe you should probably only use -m64 or its apparent equivalent -march=x86-64 -mtune=generic for your compiles.

I can't see how you would have any problem with this since the intent of those compiler options is that gcc will produce code capable of running correctly on any x86-64/amd64 compliant CPU. (No?)
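For anyone who wants to repeat this comparison on their own guest, here is a rough Python sketch of one way to do it. It asks the gcc driver to print (but not run) the compiler command line with -###, then picks the -m... and --param options out of the cc1 line. The helper names extract_target_options and native_options are made up for illustration, and the sketch assumes gcc is on the PATH, /dev/null exists, and the driver writes the cc1 command line to stderr.

```python
import shlex
import subprocess


def extract_target_options(driver_output):
    """Collect -m... and --param options from the cc1 line of `gcc -###` output."""
    options = []
    for line in driver_output.splitlines():
        try:
            tokens = shlex.split(line)  # -### quotes most arguments
        except ValueError:
            continue  # unbalanced quotes, so not a command line
        if not tokens or not tokens[0].endswith("cc1"):
            continue
        i = 1
        while i < len(tokens):
            tok = tokens[i]
            if tok == "--param" and i + 1 < len(tokens):
                # "--param name=value" passed as two separate arguments
                options.append("--param " + tokens[i + 1])
                i += 2
                continue
            if tok.startswith("-m") or tok.startswith("--param="):
                options.append(tok)
            i += 1
    return options


def native_options(gcc="gcc"):
    """Ask the gcc driver what -march=native turns into on this machine."""
    result = subprocess.run(
        [gcc, "-###", "-E", "-x", "c", "/dev/null", "-march=native"],
        capture_output=True, text=True, check=True,
    )
    return extract_target_options(result.stderr)


if __name__ == "__main__":
    for option in native_options():
        print(option)
```

Running it on two guests (or on the same guest at different times) and diffing the output should show whether -march=native would pick different instruction sets on them.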
I am frankly astounded at how specific the gcc -march=native CPU options turned out to be. I have no idea how a CPU's L1 cache size being 32k could be used to fine tune the generated code. But apparently if there is a way to do this, then using -march=native will allow gcc to do it.

I wonder if this might result in any noticeable performance improvements?
One would like to think that the CPU architecture reported by the guest OS is what you should optimize for. Otherwise, I'd call it a bug. There can be decent reasons for bugs sometimes, but...
Note that not all hypervisors will necessarily be the same.
It might be a good idea to check on a mailing list for your specific hypervisor.