How to test AVX-512 instructions w/o supported hardware

Posted 2019-07-17 04:42

Question:

I'm trying to learn x86-64's new AVX-512 instructions, but neither of my computers supports them. I tried various disassemblers (from Visual Studio's to online ones) to see the instructions for specific opcode encodings, but I'm getting somewhat conflicting results. It would also be nice to run some instructions and see their actual output.

So I'm wondering whether there is an online service that lets me compile a small piece of x86-64 assembly and run it, or step through it, on a specific processor (say, Intel's Sandy Bridge, Cannon Lake, etc.).

Answer 1:

Use the Intel® Software Development Emulator (SDE) to run an executable on an emulated CPU that supports future instruction sets. It's freeware (a free download, but not open source) and is available for Linux and Windows, and I think also for OS X.

https://software.intel.com/en-us/articles/debugging-applications-with-intel-sde has step-by-step instructions for debugging with it on Windows or Linux. SDE can work as a GDB remote: run sde -debug -- ./your-program, then in another terminal run gdb ./your-program and use target remote :portnumber to connect to the SDE process. From there you can set breakpoints and single-step.
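
As something small to run under the emulator, here is a minimal sketch of a test program in C. It is not from the linked article. It assumes GCC or Clang on an x86-64 host; the function names (add16, add16_avx512, add16_scalar, demo_add16) are made up for illustration. The AVX-512 path is only taken when the CPU (real or emulated) reports AVX-512F, so the same binary also runs on a machine without it.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>
#include <stdio.h>

/* AVX-512F path. The target attribute (GCC/Clang extension) lets this one
 * function use AVX-512 even if the file is built without -mavx512f. */
__attribute__((target("avx512f")))
static void add16_avx512(const int32_t *a, const int32_t *b, int32_t *out)
{
    __m512i va = _mm512_loadu_si512((const void *)a);   /* 16 x int32 */
    __m512i vb = _mm512_loadu_si512((const void *)b);
    _mm512_storeu_si512((void *)out, _mm512_add_epi32(va, vb));
}

/* Plain C reference, also used as the fallback. */
static void add16_scalar(const int32_t *a, const int32_t *b, int32_t *out)
{
    for (int i = 0; i < 16; i++)
        out[i] = a[i] + b[i];
}

/* Adds 16 int32 elements. Returns 1 if the AVX-512 path ran, 0 otherwise. */
int add16(const int32_t *a, const int32_t *b, int32_t *out)
{
    if (__builtin_cpu_supports("avx512f")) {
        add16_avx512(a, b, out);
        return 1;
    }
    add16_scalar(a, b, out);
    return 0;
}

/* Hypothetical demo: prints which path ran and checks it against the
 * scalar reference. Call it from your own main(). */
int demo_add16(void)
{
    int32_t a[16], b[16], out[16], ref[16];
    for (int i = 0; i < 16; i++) {
        a[i] = i;
        b[i] = 100 * i;
    }
    int used_avx512 = add16(a, b, out);
    add16_scalar(a, b, ref);
    for (int i = 0; i < 16; i++)
        assert(out[i] == ref[i]);
    printf("path: %s\n", used_avx512 ? "AVX-512" : "scalar fallback");
    return used_avx512;
}
```

Built with something like gcc -O2 and a main() that calls demo_add16(), this should report the scalar fallback when run natively on a CPU without AVX-512, and the AVX-512 path when run as sde -skx -- ./a.out (assuming SDE's Skylake-server option). The runtime check is there so that a mistake in how you launch the emulator shows up as a printed message rather than an illegal-instruction crash.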


You might be able to do the same thing with QEMU, if they've added support for emulating AVX512. QEMU can also act as a GDB remote.

QEMU definitely has configurable instruction-set options, e.g. you could tell it to emulate an x86 with AVX but not AVX2 (like Sandy Bridge). SDE can probably do the same thing.

You could even tell it to emulate something you won't find on real hardware, like AVX2 but not BMI1/2, if you want to verify that your CPUID checks don't assume that one feature implies another when that isn't guaranteed.
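
To make the point about independent CPUID checks concrete, here is a minimal sketch in C, assuming GCC or Clang on x86 (it relies on their __builtin_cpu_supports built-in). The struct and function names are hypothetical. Each feature is queried on its own, and a kernel that needs two features tests both bits instead of inferring one from the other.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical feature record: one flag per feature, nothing inferred. */
struct cpu_features {
    bool avx2;
    bool bmi1;
    bool bmi2;
    bool avx512f;
};

/* Query each feature separately via the GCC/Clang built-in. */
struct cpu_features detect_features(void)
{
    struct cpu_features f;
    __builtin_cpu_init();   /* harmless if already initialized */
    f.avx2    = __builtin_cpu_supports("avx2") != 0;
    f.bmi1    = __builtin_cpu_supports("bmi") != 0;
    f.bmi2    = __builtin_cpu_supports("bmi2") != 0;
    f.avx512f = __builtin_cpu_supports("avx512f") != 0;
    return f;
}

/* A kernel that uses both AVX2 and BMI2 instructions must require both.
 * Checking only avx2 would be wrong on an (emulated) CPU that has AVX2
 * but not BMI2. */
bool can_use_avx2_bmi2_kernel(const struct cpu_features *f)
{
    return f->avx2 && f->bmi2;
}

/* Print what was detected, so runs under different emulated CPUs
 * can be compared. */
void print_features(const struct cpu_features *f)
{
    printf("avx2=%d bmi1=%d bmi2=%d avx512f=%d\n",
           f->avx2, f->bmi1, f->bmi2, f->avx512f);
}
```

Running a program built around this under an emulated CPU with an unusual feature mix is one way to confirm that the dispatch falls back cleanly instead of hitting an illegal instruction.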


Remember that both of these are essentially useless for performance testing; they are only good for checking the correctness of your vectorization. IACA could be useful to get an idea of performance on SKX, but it's far from perfect and doesn't model memory bottlenecks at all (only the actual pipeline, in some level of detail).



Answer 2:

There are online tools which allow you to at least select different assembly dialects, but I'm not seeing anything that supports Xeon Phi or Skylake. However, the Intel C++ and Fortran compilers support cross-compiling for those additional architectures. It seems you're using Windows, and that is directly supported.

Another option is to rent an AWS EC2 C5 instance, which natively supports AVX-512, to play with. For learning purposes, this can be done for as little as $0.085/hr for a reserved instance or $0.0185/hr if you're fine with Spot pricing.