Byte swap during copy

Posted 2020-07-07 11:25

Question:

I need to efficiently swap the byte order of an array during copying into another array.

The source array is of a certain type; char, short or int so the byte swapping required is unambiguous and will be according to that type.

My plan is to do this very simply with a multi-pass byte-wise copy (2 passes for short, 4 for int, ...). However, are there any pre-existing "memcpy_swap_16/32/64" functions or libraries? Perhaps something from image processing, such as BGR/RGB conversion.

EDIT

I know how to swap the bytes of individual values, that is not the problem. I want to do this process during a copy that I am going to perform anyway.

For example, if I have an array of little-endian 4-byte integers I can do the swap by performing 4 bytewise copies with initial offsets of 0, 1, 2 and 3 and a stride of 4. But there may be a better way; perhaps even reading each 4-byte integer individually and using the byte-swap intrinsics _byteswap_ushort, _byteswap_ulong and _byteswap_uint64 would be faster. I suspect there must be existing functions that do this type of processing.
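For reference, the multi-pass strided copy described above could look roughly like the sketch below. The function name copy_swap_multipass is made up for illustration, and the buffers are assumed not to overlap.

```c
#include <stddef.h>

/* Hypothetical helper: copy `count` elements of `elem_size` bytes each from
   src to dst, reversing the bytes inside every element. It makes one pass
   per byte position (offsets 0..elem_size-1, stride elem_size).
   Assumes dst and src do not overlap. */
void copy_swap_multipass(void *dst, const void *src,
                         size_t count, size_t elem_size)
{
    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dst;

    for (size_t offset = 0; offset < elem_size; ++offset) {
        /* Byte `offset` of each source element lands at the mirrored
           position inside the destination element. */
        size_t mirrored = elem_size - 1 - offset;
        for (size_t i = 0; i < count; ++i)
            d[i * elem_size + mirrored] = s[i * elem_size + offset];
    }
}
```

Each pass touches only one byte out of every elem_size, so the whole array is walked elem_size times; that is the cost the single-pass approaches try to avoid.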

EDIT 2

Just found this, which may be a useful basis for an SSE version, though it's true that memory bandwidth probably makes it a waste of time.

Fast vectorized conversion from RGB to BGRA

Answer 1:

Unix systems have a swab function that does what you want for 16-bit arrays. It's probably optimized, but I'm not sure. Note that modern gcc will generate extremely efficient code if you just write the naive byte swap code:

uint32_t x, y;
y = (x<<24) | (x<<8 & 0xff0000) | (x>>8 & 0xff00) | (x>>24);

i.e. it will use the bswap instruction on i486+. Presumably putting this in a loop will give an efficient loop too...
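For the 16-bit case, a thin wrapper around swab might look like the sketch below. The wrapper name copy_swap16_swab is invented for this example, and the _XOPEN_SOURCE define assumes a glibc-style libc that hides swab unless a feature-test macro asks for it.

```c
#define _XOPEN_SOURCE 700   /* assumption: needed on some libcs to expose swab() */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical wrapper: copy `count` 16-bit values from src to dst,
   swapping the two bytes of each value. Buffers must not overlap. */
void copy_swap16_swab(uint16_t *dst, const uint16_t *src, size_t count)
{
    /* Note the argument order: swab() takes the source first, then the
       destination, and the length is in bytes, not elements. */
    swab(src, dst, (ssize_t)(count * sizeof *src));
}
```

swab only handles pairs of bytes, so it does not help with 32-bit or 64-bit elements.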

Edit: For your copying task, I would do the following in your loop:

  1. Read a 32-bit value from const uint32_t *src.
  2. Use the above code to swap it.
  3. Write a 32-bit value to uint32_t *dest.

I originally noted that, strictly speaking, this may not be portable (aliasing violations), although as long as the copy function is in its own translation unit and not inlined there is very little to worry about. Forget what I wrote about aliasing: if you're swapping the data as 32-bit values, it almost surely was 32-bit values to begin with, not some other type of pointer that was cast, so there's no issue.
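The three steps above can be sketched as a single loop. The names swap32 and copy_swap32 are placeholders, and the buffers are assumed not to overlap; whether the compiler turns the shifts into a bswap instruction depends on the compiler and optimization level.

```c
#include <stddef.h>
#include <stdint.h>

/* The shift-and-mask swap from the answer, wrapped in a function. */
static inline uint32_t swap32(uint32_t x)
{
    return (x << 24) | ((x << 8) & 0x00ff0000u)
         | ((x >> 8) & 0x0000ff00u) | (x >> 24);
}

/* Hypothetical name: read a 32-bit value, swap it, write it out.
   Assumes dest and src do not overlap. */
void copy_swap32(uint32_t *dest, const uint32_t *src, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        dest[i] = swap32(src[i]);
}
```

Because the swap happens in a register between the load and the store, the array is read once and written once, just like a plain copy.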



Answer 2:

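On Linux, check the header byteswap.h (the definitions live in bits/byteswap.h, which is not meant to be included directly). There's a family of macros of the form bswap_## (bswap_16, bswap_32, bswap_64), and some of them use assembly instructions where appropriate.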
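A minimal sketch of a copy loop built on these macros, assuming a glibc-style byteswap.h is available (it is not part of standard C). The function name copy_bswap64 is made up for the example.

```c
#include <byteswap.h>   /* assumption: glibc (or compatible) header */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: copy `count` 64-bit values, reversing the byte order
   of each one with bswap_64. Assumes dest and src do not overlap. */
void copy_bswap64(uint64_t *dest, const uint64_t *src, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        dest[i] = bswap_64(src[i]);
}
```

The 16-bit and 32-bit variants would be the same loop with bswap_16 or bswap_32 and the matching element type.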



Answer 3:

Yes, there are existing functions like the one linked in the question, but it's not worth the effort because the size of the data (in this case) means the setup overhead is too high. Instead, it's better to just read 2, 4 or 8 bytes at a time, do the swap using intrinsics, and write the result back.
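One way this could look is sketched below. The SWAP16/SWAP32/SWAP64 macros and the function copy_swap_intrinsic are invented for the example; the non-MSVC branch assumes GCC or Clang builtins. memcpy is used for the loads and stores so the sketch makes no alignment or aliasing assumptions about the buffers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(_MSC_VER)
#include <stdlib.h>   /* _byteswap_ushort, _byteswap_ulong, _byteswap_uint64 */
#define SWAP16(x) _byteswap_ushort(x)
#define SWAP32(x) _byteswap_ulong(x)
#define SWAP64(x) _byteswap_uint64(x)
#else                 /* assumption: GCC or Clang */
#define SWAP16(x) __builtin_bswap16(x)
#define SWAP32(x) __builtin_bswap32(x)
#define SWAP64(x) __builtin_bswap64(x)
#endif

/* Hypothetical helper: copy `count` elements of 1, 2, 4 or 8 bytes while
   reversing the byte order of each element. Returns 0 on success and -1
   for an unsupported element size. Buffers must not overlap. */
int copy_swap_intrinsic(void *dst, const void *src,
                        size_t count, size_t elem_size)
{
    const unsigned char *s = (const unsigned char *)src;
    unsigned char *d = (unsigned char *)dst;
    size_t i;

    switch (elem_size) {
    case 1:   /* single bytes have no order to swap */
        memcpy(d, s, count);
        return 0;
    case 2:
        for (i = 0; i < count; ++i) {
            uint16_t v;
            memcpy(&v, s + 2 * i, 2);
            v = SWAP16(v);
            memcpy(d + 2 * i, &v, 2);
        }
        return 0;
    case 4:
        for (i = 0; i < count; ++i) {
            uint32_t v;
            memcpy(&v, s + 4 * i, 4);
            v = SWAP32(v);
            memcpy(d + 4 * i, &v, 4);
        }
        return 0;
    case 8:
        for (i = 0; i < count; ++i) {
            uint64_t v;
            memcpy(&v, s + 8 * i, 8);
            v = SWAP64(v);
            memcpy(d + 8 * i, &v, 8);
        }
        return 0;
    default:
        return -1;
    }
}
```

Optimizing compilers usually reduce the fixed-size memcpy calls to plain loads and stores, but that is worth confirming in the generated code for the target in question.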



Tags: c++ c memory