I'm implementing binary translation and have to deal with sequences of NOPs (0x90) about 16 opcodes long. Is it better for performance to place a JMP (to the end) at the start of such sequences?
Since this is binary translation, I would start by translating them into equivalent NOPs on the target system. Once things are working, optimize out dead code. At the same time, since this string of instructions caught your eye, try to understand what it was there for (perhaps waiting on hardware to do something), and make sure that your translated system behaves the same.
If the NOPs are there to align the instruction stream, then they have more value than just being a no-op.

If you're concerned with pure speed, see Agner Fog's Optimization Manuals, Vol. 4.

The Intel Architecture Software Developer's Guide, volume 2B (instructions N-Z), contains a table (pg 4-12) about NOP: Table 4-9, "Recommended Multi-Byte Sequence of NOP Instruction".

This allows you to construct "padding NOPs" of certain sizes. With two of those, you can bridge 16 bytes, although I second the suggestion to check the optimization guides (for the CPU you're targeting) to see whether a JMP is faster than two such NOPs.
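To make the padding idea concrete, here is a minimal C sketch of how a translator might fill a gap with as few NOP instructions as possible. The byte sequences are the 1- to 9-byte encodings from the Intel table mentioned above. The names `NOP_SEQ` and `emit_nop_padding` are made up for illustration, and the sketch assumes the target CPU supports the multi-byte `0F 1F` NOP form (very old x86 processors may not).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Recommended multi-byte NOP encodings for lengths 1..9 bytes.
 * Row i holds the (i + 1)-byte sequence; unused trailing bytes are zero. */
static const uint8_t NOP_SEQ[9][9] = {
    {0x90},                                                 /* NOP */
    {0x66, 0x90},                                           /* 66 NOP */
    {0x0F, 0x1F, 0x00},                                     /* NOP [EAX] */
    {0x0F, 0x1F, 0x40, 0x00},                               /* NOP [EAX+disp8] */
    {0x0F, 0x1F, 0x44, 0x00, 0x00},                         /* NOP [EAX+EAX*1+disp8] */
    {0x66, 0x0F, 0x1F, 0x44, 0x00, 0x00},                   /* 66 + the 5-byte form */
    {0x0F, 0x1F, 0x80, 0x00, 0x00, 0x00, 0x00},             /* NOP [EAX+disp32] */
    {0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},       /* NOP [EAX+EAX*1+disp32] */
    {0x66, 0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00}, /* 66 + the 8-byte form */
};

/* Hypothetical helper: fill n bytes at out with NOP padding, greedily using
 * the longest available encoding. Returns the number of instructions emitted. */
static size_t emit_nop_padding(uint8_t *out, size_t n)
{
    size_t count = 0;
    while (n > 0) {
        size_t chunk = n > 9 ? 9 : n;           /* longest sequence that fits */
        memcpy(out, NOP_SEQ[chunk - 1], chunk);
        out += chunk;
        n -= chunk;
        count++;
    }
    return count;
}
```

For a 16-byte gap this emits two instructions (a 9-byte NOP followed by a 7-byte NOP) instead of sixteen single-byte ones. Whether that beats a short JMP over the gap still depends on the target CPU, so measure before committing to either.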