Question:
How can I check whether my installed version of numpy was compiled with the SSE/SSE2 instruction sets? I know that some parts of numpy use BLAS; how can I check BLAS too?
Answer 1:
Take a look at:
import numpy.distutils.system_info as sysinfo
sysinfo.show_all()
This will print out all of the information about what numpy was compiled against.
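On recent NumPy releases, where numpy.distutils is deprecated (and not available on Python 3.12 and later), the public numpy.show_config() function reports similar build information, including the BLAS/LAPACK libraries found at build time. A minimal sketch, assuming show_config() prints to standard output; the helper name numpy_build_report is hypothetical and only there to capture the text:

```python
import contextlib
import io


def numpy_build_report():
    """Return numpy's build configuration as text, or None if numpy is not installed."""
    try:
        import numpy
    except ImportError:
        return None
    buf = io.StringIO()
    # show_config() prints its report, so capture stdout to get it as a string.
    with contextlib.redirect_stdout(buf):
        numpy.show_config()
    return buf.getvalue()


if __name__ == "__main__":
    print(numpy_build_report())
```

The exact layout of the report differs between NumPy versions, so it is better read by eye than parsed.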
Answer 2:
One way, if you are on Linux, is to run objdump on numpy's .so files and grep for instructions that are specific to SSE.
For SSE3 (http://en.wikipedia.org/wiki/SSE3):
objdump -d /usr/lib/pyshared/python2.7/numpy/core/*.so | grep -i MOVDDUP
For SSE2 (http://fr.wikipedia.org/wiki/Jeu_d%27instructions_x86#Instructions_SSE2):
objdump -d /usr/lib/pyshared/python2.7/numpy/core/*.so | grep -i ADDPD
If grep returns any matches, the binary contains instructions from that set, which means it was compiled with the corresponding SSE flags enabled.
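The same check can be sketched in Python. A plain grep can also match text inside symbol names, so this version looks only at the mnemonic column of each instruction line. It assumes objdump is on the PATH and emits its usual tab-separated layout (address, bytes, instruction); the function names are hypothetical:

```python
import subprocess


def mnemonics(disassembly):
    """Yield the lowercase mnemonic of each instruction line in `objdump -d` output."""
    for line in disassembly.splitlines():
        fields = line.split("\t")
        # Instruction lines look like "  addr:<TAB>hex bytes<TAB>mnemonic operands".
        if len(fields) >= 3 and fields[2].strip():
            yield fields[2].split()[0].lower()


def uses_instruction(disassembly, name):
    """Return True if any instruction line uses exactly the given mnemonic."""
    name = name.lower()
    return any(m == name for m in mnemonics(disassembly))


def disassemble(path):
    """Run `objdump -d` on one file and return its output as text."""
    result = subprocess.run(
        ["objdump", "-d", path], check=True, capture_output=True, text=True
    )
    return result.stdout
```

For example, uses_instruction(disassemble(path), "MOVDDUP") tests one shared object for an SSE3 instruction. Note that SSE2 is part of the x86-64 baseline, so SSE2 matches are expected in any 64-bit build.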
Answer 3:
First, find the path of numpy/core .so files:
python -c "import numpy, os; print (os.path.join(os.path.dirname(numpy.__file__), 'core/*.so'))"
# Output: /usr/local/lib/python2.7/dist-packages/numpy/core/*.so
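The directory layout and file names vary between installations: extension files usually carry a platform tag in their name, and newer NumPy releases keep the compiled core under numpy/_core rather than numpy/core. A rough sketch that searches the whole package recursively instead of relying on one path; the helper name extension_modules is hypothetical:

```python
import importlib.util
from pathlib import Path


def extension_modules(package_name):
    """Return sorted paths of compiled extension files (*.so, *.pyd) in an installed package."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return []  # package not installed, or not a package directory
    found = []
    for root in spec.submodule_search_locations:
        for pattern in ("*.so", "*.pyd"):
            found.extend(Path(root).rglob(pattern))
    return sorted(str(p) for p in found)


if __name__ == "__main__":
    for path in extension_modules("numpy"):
        print(path)
```

The printed paths can then be passed to objdump one by one, or narrowed to the core directory by filtering on the path.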
Then disassemble numpy's core into the file numpy.asm:
objdump -d /usr/local/lib/python2.7/dist-packages/numpy/core/*.so > numpy.asm
Then find all SSE instructions:
gawk 'BEGIN { IGNORECASE = 1 } /\<(ADDPS|ADDSS|ANDNPS|ANDPS|CMPPS|CMPSS|COMISS|CVTPI2PS|CVTPS2PI|CVTSI2SS|CVTSS2SI|CVTTPS2PI|CVTTSS2SI|DIVPS|DIVSS|LDMXCSR|MAXPS|MAXSS|MINPS|MINSS|MOVAPS|MOVHLPS|MOVHPS|MOVLHPS|MOVLPS|MOVMSKPS|MOVNTPS|MOVSS|MOVUPS|MULPS|MULSS|ORPS|RCPPS|RCPSS|RSQRTPS|RSQRTSS|SHUFPS|SQRTPS|SQRTSS|STMXCSR|SUBPS|SUBSS|UCOMISS|UNPCKHPS|UNPCKLPS|XORPS|PAVGB|PAVGW|PEXTRW|PINSRW|PMAXSW|PMAXUB|PMINSW|PMINUB|PMOVMSKB|PSADBW|PSHUFW)\>/' numpy.asm
Find all SSE2 instructions (except MOVSD and CMPSD, whose mnemonics are shared with string instructions that date back to the 80386 and would give false matches):
gawk 'BEGIN { IGNORECASE = 1 } /\<(ADDPD|ADDSD|ANDNPD|ANDPD|CMPPD|COMISD|CVTDQ2PD|CVTDQ2PS|CVTPD2DQ|CVTPD2PI|CVTPD2PS|CVTPI2PD|CVTPS2DQ|CVTPS2PD|CVTSD2SI|CVTSD2SS|CVTSI2SD|CVTSS2SD|CVTTPD2DQ|CVTTPD2PI|CVTTPS2DQ|CVTTSD2SI|DIVPD|DIVSD|MAXPD|MAXSD|MINPD|MINSD|MOVAPD|MOVHPD|MOVLPD|MOVMSKPD|MOVUPD|MULPD|MULSD|ORPD|SHUFPD|SQRTPD|SQRTSD|SUBPD|SUBSD|UCOMISD|UNPCKHPD|UNPCKLPD|XORPD|MOVDQ2Q|MOVDQA|MOVDQU|MOVQ2DQ|PADDQ|PMULUDQ|PSHUFHW|PSHUFLW|PSHUFD|PSLLDQ|PSRLDQ|PUNPCKHQDQ|PUNPCKLQDQ)\>/' numpy.asm
Find all SSE3 instructions:
gawk 'BEGIN { IGNORECASE = 1 } /\<(ADDSUBPD|ADDSUBPS|HADDPD|HADDPS|HSUBPD|HSUBPS|MOVDDUP|MOVSHDUP|MOVSLDUP|LDDQU|FISTTP)\>/' numpy.asm
Find all SSSE3 instructions:
gawk 'BEGIN { IGNORECASE = 1 } /\<(PSIGNW|PSIGND|PSIGNB|PSHUFB|PMULHRSW|PMADDUBSW|PHSUBW|PHSUBSW|PHSUBD|PHADDW|PHADDSW|PHADDD|PALIGNR|PABSW|PABSD|PABSB)\>/' numpy.asm
Find all SSE4 instructions (this list mixes SSE4.1, SSE4.2 and AMD's SSE4a):
gawk 'BEGIN { IGNORECASE = 1 } /\<(MPSADBW|PHMINPOSUW|PMULLD|PMULDQ|DPPS|DPPD|BLENDPS|BLENDPD|BLENDVPS|BLENDVPD|PBLENDVB|PBLENDDW|PMINSB|PMAXSB|PMINUW|PMAXUW|PMINUD|PMAXUD|PMINSD|PMAXSD|ROUNDPS|ROUNDSS|ROUNDPD|ROUNDSD|INSERTPS|PINSRB|PINSRD|PINSRQ|EXTRACTPS|PEXTRB|PEXTRD|PEXTRW|PEXTRQ|PMOVSXBW|PMOVZXBW|PMOVSXBD|PMOVZXBD|PMOVSXBQ|PMOVZXBQ|PMOVSXWD|PMOVZXWD|PMOVSXWQ|PMOVZXWQ|PMOVSXDQ|PMOVZXDQ|PTEST|PCMPEQQ|PCMPGTQ|PACKUSDW|PCMPESTRI|PCMPESTRM|PCMPISTRI|PCMPISTRM|CRC32|POPCNT|MOVNTDQA|EXTRQ|INSERTQ|MOVNTSD|MOVNTSS|LZCNT)\>/' numpy.asm
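To get one summary instead of five separate listings, the searches above can be folded into a single pass that counts matches per instruction set. A minimal sketch: SAMPLE_SETS holds only a few representative mnemonics from each list (an illustrative subset, not a complete table), and the name tally is hypothetical:

```python
import re
from collections import Counter

# Deliberately small, illustrative subsets of the lists above, not complete tables.
SAMPLE_SETS = {
    "SSE": ["ADDPS", "MULPS", "MOVAPS", "XORPS"],
    "SSE2": ["ADDPD", "MULPD", "MOVAPD", "MOVDQA"],
    "SSE3": ["HADDPD", "MOVDDUP", "LDDQU"],
    "SSSE3": ["PSHUFB", "PALIGNR", "PABSD"],
    "SSE4": ["PMULLD", "ROUNDPD", "PTEST", "POPCNT"],
}


def tally(asm_text, sets=SAMPLE_SETS):
    """Count whole-word, case-insensitive mnemonic matches per instruction set."""
    counts = Counter()
    for set_name, names in sets.items():
        # \b plays the role of gawk's \< and \> word-boundary operators.
        pattern = re.compile(r"\b(?:%s)\b" % "|".join(names), re.IGNORECASE)
        counts[set_name] = len(pattern.findall(asm_text))
    return dict(counts)
```

With the disassembly saved as above, tally(open("numpy.asm").read()) returns a dictionary mapping each set name to its match count. Extending SAMPLE_SETS with the full gawk lists would make the counts comparable to the commands above.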