Detect -xarch option in the preprocessor?

Posted 2019-07-15 05:16

I'm using Sun Studio 12.4 and 12.5 on Solaris 11. We have a source file that provides a straight C/C++ implementation of CRC32, or an optimized version of CRC32 using Intel intrinsics. At runtime, a function pointer is populated with the proper implementation.
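
The arrangement described here can be sketched roughly as follows. This is a minimal sketch and not the actual crc.cpp: the names Crc32cPortable, Crc32cSSE42, SelectCrc32c and the CRC_HAVE_SSE42_OBJECT macro are hypothetical, and the checksum shown is CRC-32C because that is the polynomial the SSE4.2 crc32 instruction implements. It assumes the accelerated routine lives in a separate source file compiled with -xarch=sse4_2, and that the runtime check can use getisax(2) on Solaris x86; on every other platform the sketch simply falls back to the portable routine.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__sun) && (defined(__i386) || defined(__x86_64))
# include <sys/types.h>
# include <sys/auxv.h>   // getisax(2) and, on x86, the AV_386_* bits
#endif

typedef unsigned int word32;   // assumed to be 32 bits wide
typedef word32 (*Crc32cFn)(word32 crc, const unsigned char* data, std::size_t len);

// Portable bitwise CRC-32C update (reflected polynomial 0x82F63B78).
word32 Crc32cPortable(word32 crc, const unsigned char* data, std::size_t len)
{
    assert(data != NULL || len == 0);
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

#if defined(CRC_HAVE_SSE42_OBJECT)
// Hypothetical: defined in another file that is built with -xarch=sse4_2.
word32 Crc32cSSE42(word32 crc, const unsigned char* data, std::size_t len);
#endif

// Ask the OS at runtime whether the CPU supports SSE4.2.
bool CpuHasSSE42()
{
#if defined(__sun) && (defined(__i386) || defined(__x86_64)) && defined(AV_386_SSE4_2)
    uint32_t bits[1] = { 0 };
    (void)getisax(bits, 1);
    return (bits[0] & AV_386_SSE4_2) != 0;
#else
    return false;   // unknown platform: stay on the portable path
#endif
}

// Pick the implementation that the function pointer will hold.
Crc32cFn SelectCrc32c()
{
#if defined(CRC_HAVE_SSE42_OBJECT)
    if (CpuHasSSE42())
        return &Crc32cSSE42;
#endif
    return &Crc32cPortable;
}

// Public entry point: the pointer is populated once, on first use.
word32 Crc32c(const unsigned char* data, std::size_t len)
{
    static const Crc32cFn fn = SelectCrc32c();
    return fn(0xFFFFFFFFu, data, len) ^ 0xFFFFFFFFu;
}
```

Calling Crc32c on the nine ASCII bytes "123456789" should return 0xE3069283, the standard CRC-32C check value, whichever implementation the pointer ends up holding.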

Testing on an x86 server with dual Xeons produces the error below, because we make code paths available based on compiler versions. SunCC 12.1 added support for SSE4 (if I parsed the matrix properly), so we attempt to enable it when __SUNPRO_CC >= 0x5100.

"crc.cpp", line 311: ube: error: _mm_crc32_u8 intrinsic requires at least -xarch=sse4_2.

SunCC does not define customary GCC defines, like __SSE4_1__ and __SSE4_2__. In addition, SunCC does not appear to make intrinsics available like MS VC++, where a compiler version indicates the support.

SunCC appears to enable features based on -xarch options, but it's not clear to me how to detect them in the preprocessor. In addition, using -xarch sets some bits that cause the program to fail to execute on downlevel processors (something like a "minimum" platform).

I have two questions.

  • how do I detect the -xarch option in the preprocessor?
  • how do I disable the -xarch bits so the program can run on down level processors?

Below is a macro dump from compiling with -xarch=aes. Notice that nothing in it indicates which features are available.

$ /opt/solarisstudio12.4/bin/CC -native -m64 -xarch=aes -xdumpmacros -E /dev/null 2>&1 | /usr/gnu/bin/sort --ignore-case

#1 "/dev/null"
#define __alignof__ __alignof
#define __amd64 1
#define __amd64__ 1
#define __ARRAYNEW 1
#define __asm asm
#define __asm__ asm
#define __attribute __attribute__
#define __builtin_constant_p __oracle_builtin_constant_p
#define __builtin_fpclassify __oracle_builtin_fpclassify
#define __builtin_huge_val __oracle_builtin_huge_val
#define __builtin_huge_valf __oracle_builtin_huge_valf
#define __builtin_huge_vall __oracle_builtin_huge_vall
#define __builtin_infinity __oracle_builtin_infinity
#define __builtin_isfinite __oracle_builtin_isfinite
#define __builtin_isgreater __oracle_builtin_isgreater
#define __builtin_isgreaterequal __oracle_builtin_isgreaterequal
#define __builtin_isinf __oracle_builtin_isinf
#define __builtin_isless __oracle_builtin_isless
#define __builtin_islessequal __oracle_builtin_islessequal
#define __builtin_islessgreater __oracle_builtin_islessgreater
#define __builtin_isnan __oracle_builtin_isnan
#define __builtin_isnormal __oracle_builtin_isnormal
#define __builtin_isunordered __oracle_builtin_isunordered
#define __builtin_nan __oracle_builtin_nan
#define __builtin_signbit __oracle_builtin_signbit
#define __BUILTIN_VA_STRUCT 1
#define __cplusplus 199711L
#define __DATE__ "Jul 11 2016"
#define __FILE__ 
#define __has_attribute(x) __oracle_has_attribute(x)
#define __has_nothrow_assign(x) __oracle_has_nothrow_assign(x)
#define __has_nothrow_constructor(x) __oracle_has_nothrow_constructor(x)
#define __has_nothrow_copy(x) __oracle_has_nothrow_copy(x)
#define __has_trivial_assign(x) __oracle_has_trivial_assign(x)
#define __has_trivial_constructor(x) __oracle_has_trivial_constructor(x)
#define __has_trivial_copy(x) __oracle_has_trivial_copy(x)
#define __has_trivial_destructor(x) __oracle_has_trivial_destructor(x)
#define __has_virtual_destructor(x) __oracle_has_virtual_destructor(x)
#define __is_abstract(x) __oracle_is_abstract(x)
#define __is_base_of(x,y) __oracle_is_base_of(x,y)
#define __is_class(x) __oracle_is_class(x)
#define __is_empty(x) __oracle_is_empty(x)
#define __is_enum(x) __oracle_is_enum(x)
#define __is_final(x) __oracle_is_final(x)
#define __is_literal_type(x) __oracle_is_literal_type(x)
#define __is_pod(x) __oracle_is_pod(x)
#define __is_polymorphic(x) __oracle_is_polymorphic(x)
#define __is_standard_layout(x) __oracle_is_standard_layout(x)
#define __is_trivial(x) __oracle_is_trivial(x)
#define __is_union(x) __oracle_is_union(x)
#define __LINE__ 
#define __LP64__ 1
#define __PRAGMA_REDEFINE_EXTNAME 1
#define __STDC__ 0
#define __sun 1
#define __SUN_PREFETCH 1
#define __SunOS 1
#define __SunOS_5_11 1
#define __SUNPRO_CC 0x5130
#define __SUNPRO_CC_COMPAT 5
#define __SVR4 1
#define __TIME__ "20:58:00"
#define __underlying_type(x) __oracle_underlying_type(x)
#define __unix 1
#define __volatile volatile
#define __volatile__ volatile
#define __x86_64 1
#define __x86_64__ 1
#define _BOOL 1
#define _LARGEFILE64_SOURCE 1
#define _LP64 1
#define _SIGNEDCHAR_ 1
#define _TEMPLATE_NO_EXTDEF 1
#define _WCHAR_T 
#define sun 1
#define unix 1

3 answers
劳资没心,怎么记你
#2 · 2019-07-15 05:35

First, you do not want to remove the instruction set flags from your compiled binaries. When you compile with an -xarch=NNNN option, the compiler will use those instructions. If you try to run on a "lower" processor that doesn't implement the instructions from the architecture you provided in the -xarch argument, your binary stands a good chance of not working.

From the Solaris Studio 12.4: C User's Guide:

1.3 Binary Compatibility Verification

On Solaris systems, beginning with Solaris Studio 11, program binaries compiled with the Oracle Solaris Studio compilers are marked with architecture hardware flags indicating the instruction sets assumed by the compiled binary. At runtime, these marker flags are checked to verify that the binary can run on the hardware it is attempting to execute on.

Running programs that do not contain these architecture hardware flags on platforms that are not enabled with the appropriate features or instruction set extensions could result in segmentation faults or incorrect results occurring without any explicit warning messages.

Also note the mention of features as well as instruction sets. In my experience with Solaris documentation, that one little aside is enough of a warning that there is likely a whole lot more to

I don't know of any way to detect the available instruction set via the preprocessor. You may be able to get help on the Oracle forum for Solaris Studio at https://community.oracle.com/community/server_%26_storage_systems/application_development_in_c__c%2B%2B__and_fortran/developer_studio_c_c%2B%2B_fortran_compilers

I suspect that even there, you won't find a way to use the preprocessor. The usual way of providing platform- and instruction-set specific implementations on Solaris is via specific shared objects. From the Solaris Linker and Libraries Guide:

Instruction Set Specific Shared Objects

The dynamic token $ISALIST is expanded at runtime to reflect the native instruction sets executable on this platform, as displayed by the utility isalist(1).

Any string name that incorporates the $ISALIST token is effectively duplicated into multiple strings. Each string is assigned one of the available instruction sets. This token is only available for filter or runpath specifications.

...

Or an application with similar dependencies is executed on an MMX configured Pentium Pro:

$ ldd -ls prog
.....
  find object=libbar.so.1; required by ./libfoo.so.1
    search path=/opt/ISV/lib/$ISALIST  (RPATH from file ./libfoo.so.1)
      trying path=/opt/ISV/lib/pentium_pro+mmx/libbar.so.1
      trying path=/opt/ISV/lib/pentium_pro/libbar.so.1
      trying path=/opt/ISV/lib/pentium+mmx/libbar.so.1
      trying path=/opt/ISV/lib/pentium/libbar.so.1
      trying path=/opt/ISV/lib/i486/libbar.so.1
      trying path=/opt/ISV/lib/i386/libbar.so.1
      trying path=/opt/ISV/lib/i86/libbar.so.1

Note how the library search starts with the "highest" instruction-set specific library, and moves to "lower" libraries. This allows for multiple instruction-set specific shared objects to be located, from "fastest specific" to "slowest generic". libc.so on Solaris does this to provide platform-specific versions of library functions such as memcpy().
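
Coming back to the preprocessor question: since the compiler defines nothing for -xarch, one workaround is to have the build system define its own macro whenever it passes the option, for example -xarch=sse4_2 -DCRC_XARCH_SSE4_2=1. The sketch below assumes that convention. The macro names CRC_XARCH_SSE4_2 and CRC_USE_SSE4_2 are made up for illustration and are not something SunCC provides, and the checksum is CRC-32C because that is what _mm_crc32_u8 computes. Note that this selects the code path at compile time only, so an object built this way still uses SSE4.2 instructions unconditionally and still carries the hardware capability flags described above.

```cpp
#include <cassert>
#include <cstddef>

// Assumed build convention: every command line that passes -xarch=sse4_2
// also passes -DCRC_XARCH_SSE4_2=1 (hypothetical macro name).
// __SSE4_2__ covers GCC and Clang, which define it themselves.
#if defined(__SSE4_2__) || defined(CRC_XARCH_SSE4_2)
# include <nmmintrin.h>   // _mm_crc32_u8
# define CRC_USE_SSE4_2 1
#else
# define CRC_USE_SSE4_2 0
#endif

// CRC-32C update over a buffer, using the intrinsic only when the
// build declared that SSE4.2 code generation is enabled.
unsigned int Crc32cUpdate(unsigned int crc, const unsigned char* data, std::size_t len)
{
    assert(data != NULL || len == 0);
    for (std::size_t i = 0; i < len; ++i) {
#if CRC_USE_SSE4_2
        crc = _mm_crc32_u8(crc, data[i]);
#else
        // Portable bitwise fallback, reflected polynomial 0x82F63B78.
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
#endif
    }
    return crc;
}

// Full checksum with the usual initial value and final inversion.
unsigned int Crc32c(const unsigned char* data, std::size_t len)
{
    return Crc32cUpdate(0xFFFFFFFFu, data, len) ^ 0xFFFFFFFFu;
}
```

Both branches implement the same function, so the result does not depend on which one was compiled in. The cost of this approach is that the macro and the -xarch option have to be kept in sync by hand in the makefile.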

劳资没心,怎么记你
#3 · 2019-07-15 05:41

For your second question:

how do I disable the -xarch bits so the program can run on down level processors?

See Chapter 7, Capability Processing, of the Linker and Libraries Guide:

https://docs.oracle.com/cd/E53394_01/html/E54813/index.html

This shows you how to deliver multiple instances of the same function which are tagged with the capability bits. The runtime linker will resolve which function is used based on the reported capabilities.

If you really want to manage the capability bits yourself, see Chapter 9, Mapfiles, in particular the section on the CAPABILITY Directive. It shows how to remove capabilities from the generated object.

(This account has been banned)
#4 · 2019-07-15 05:41

I believe that for your particular situation (the second part of it) the only simple way to do what you want is this: compile with an explicit "-xarch=sse4_2" (this allows the compiler to expand SSE4.2 intrinsics) and then strip the HWCAP bits down to your minimal architecture (this makes your program runnable on pre-SSE4.2 hardware).

For stripping HWCAP see: https://docs.oracle.com/cd/E23823_01/html/816-5165/elfedit-1.html

(Example 2 Removing a Hardware Capability Bit)
