Oracle released Developer Studio 12.6 recently. We have an intrinsics-based implementation of SHA-1 and SHA-256 (for ARM and Intel), and we want to enable the extension on Solaris i86pc machines.
The 12.6 manual lists the -xarch options at A.2.115.3 -xarch Flags for x86, but that section does not discuss SHA.
Which -xarch option do we use for SHA?
If Studio 12.6 doesn't support the SHA instruction set (and I strongly suspect it doesn't since I can't find "SHA" mentioned at all, in any form, in the What's New in the Oracle Developer Studio 12.6 Release documentation), you're out of luck.
Almost.
What you can do is create your own inline assembler functions. See man inline:
inline(4)
Name
inline, filename.il - Assembly language inline template files
Description
Assembly language call instructions are replaced by a copy of their
corresponding function body obtained from the inline template (*.il)
file.
Inline template files have a suffix of .il, for example:
% CC foo.il hello.c
Inlining is done by the compiler's code generator.
...
Examples
Please review libm.il or vis.il for examples. You can find a version of these libraries that is specific to each supported architecture under the compiler's lib/ directory.
...
An example can be found here (emphasis mine):
Performance Tuning With Sun Studio Compilers and Inline Assembly Code
...
This paper provides a demonstration of how to measure the performance
of a critical piece of code. An example using a compiler flag and
another example using inline assembly code are provided. The results are compared to show the benefits and differences of each
approach.
...
Example 8: Inline Assembly Code for the Iterative Mandelbrot Calculation
Knowing all these facts, the inline code can be written, as shown in
Example 8.
.inline mandel_il,0
// x is stored in %xmm0
// y is stored in %xmm1
// 4.0 is stored in %xmm2
// max_int is stored in %rdi
// set registers to zero
xorps %xmm3, %xmm3
xorps %xmm4, %xmm4
xorps %xmm5, %xmm5
xorps %xmm6, %xmm6
xorps %xmm7, %xmm7
xorq %rax, %rax
.loop:
// check to see if u2 + v2 >= 4.0
movss %xmm5, %xmm7
addss %xmm6, %xmm7
ucomiss %xmm2, %xmm7
jp .exit
jae .exit
// v = 2 * v * u + y
mulss %xmm3, %xmm4
addss %xmm4, %xmm4
addss %xmm1, %xmm4
// u = u2 - v2 + x
movss %xmm5, %xmm3
subss %xmm6, %xmm3
addss %xmm0, %xmm3
// u2 = u * u
movss %xmm3, %xmm5
mulss %xmm3, %xmm5
// v2 = v * v
movss %xmm4, %xmm6
mulss %xmm4, %xmm6
incl %eax
cmpl %edi, %eax
jl .loop
.exit:
// end of mandel_il
.end
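For readers who want to check what the template above computes, here is a minimal C sketch of the same loop. It is not from the paper: the name mandel_ref and the parameter order are assumptions, chosen to match the register comments in the template (on x86-64 the first three float arguments arrive in %xmm0 to %xmm2 and the first integer argument in %rdi). The operations follow the assembly step by step, including the fact that the loop body runs at least once.

```c
/*
 * Hypothetical portable reference for the mandel_il template.
 * Assumed C prototype for the template itself (not shown in the paper excerpt):
 *     int mandel_il(float x, float y, float four, int max_int);
 */
int mandel_ref(float x, float y, float four, int max_int)
{
    float u = 0.0f, v = 0.0f;    /* %xmm3, %xmm4 */
    float u2 = 0.0f, v2 = 0.0f;  /* %xmm5, %xmm6 */
    int count = 0;               /* %eax */

    do {
        /* ucomiss + jp/jae: leave when u2 + v2 >= four or the compare is unordered */
        if (!(u2 + v2 < four))
            break;

        /* v = 2 * v * u + y (mulss, addss, addss) */
        v = v * u;
        v = v + v;
        v = v + y;

        /* u = u2 - v2 + x */
        u = u2 - v2 + x;

        /* refresh the squares for the next test */
        u2 = u * u;
        v2 = v * v;

        count++;                 /* incl %eax */
    } while (count < max_int);   /* cmpl %edi, %eax ; jl .loop */

    return count;
}
```

A reference like this is handy for comparing results against the inline version once the .il file is placed on the compile line next to the C source, as the man page excerpt shows. A SHA wrapper would follow the same pattern, with the caveat that the assembler also has to accept the instructions you put in the template.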
It's not hard at all. I had to write a lot of SPARC inline assembler functions for a customer I was consulting for back in the Solaris 8 days. Some of them were pretty basic: effectively one-liners to wrap a single instruction. I swear some of them wound up in later versions of the Studio compiler suite. Since we were subcontracted by Sun itself, that's not surprising, never mind the fact that some of them were blatantly obvious (floor() and ceil(), IIRC, were two of them) and should have been there in the first place.