可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

Pardon me if you feel this has been answered numerous times, but I need answers to the following queries!

Why data has to be aligned (on 2-byte / 4-byte / 8-byte boundaries)? Here my doubt is when the CPU has address lines Ax Ax-1 Ax-2 ... A2 A1 A0 then it is quite possible to address the memory locations sequentially. So why there is the need to align the data at specific boundaries?
How to find the alignment requirements when I am compiling my code and generating the executable?
If for e.g the data alignment is 4-byte boundary, does that mean each consecutive byte is located at modulo 4 offsets? My doubt is if data is 4-byte aligned does that mean that if a byte is at 1004 then the next byte is at 1008 (or at 1005)?

回答1:

CPUs are word oriented, not byte oriented. In a simple CPU, memory is generally configured to return one word (32bits, 64bits, etc) per address strobe, where the bottom two (or more) address lines are generally don't-care bits.

Intel CPUs can perform accesses on non-word boundries for many instructions, however there is a performance penalty as internally the CPU performs two memory accesses and a math operation to load one word. If you are doing byte reads, no alignment applies.

Some CPUs (ARM, or Intel SSE instructions) require aligned memory and have undefined operation when doing unaligned accesses (or throw an exception). They save significant silicon space by not implementing the much more complicated load/store subsystem.

Alignment depends on the CPU word size (16, 32, 64bit) or in the case of SSE the SSE register size (128 bits).

For your last question, if you are loading a single data byte at a time there is no alignment restriction on most CPUs (some DSPs don't have byte level instructions, but its likely you won't run into one).

回答2:

Very little data "has" to be aligned. It's more that certain types of data may perform better or certain cpu operations require a certain data alignment.

First of all, let's say you're reading 4 bytes of data at a time. Let's also say that your CPU has a 32 bit data buss. Let's also say your data is stored at byte 2 in the system memory.

Now since you can load 4 bytes of data at once, it doesn't make too much sense to have your Address register to point to a single byte. By making your address register point to every 4 bytes you can manipulate 4 times the data. So in other words your CPU may only be able to read data starting at bytes 0, 4, 8, 12, 16, etc.

So here's the issue. If you want the data starting at byte 2 and you're reading 4 bytes, then half your data will be in address position 0 and the other half in position 1.

So basically you'd end up hitting the memory twice to read your one 4 byte data element. Some CPUs don't support this sort of operation (or force you to load and combine the two results manually).

Go here for more details: http://en.wikipedia.org/wiki/Data_structure_alignment

回答3:

1.) Some architectures do not have this requirement at all, some encourage alignment (there is a speed penalty when accessing non-alignet data items), and some may enforce it strictly (misaligment causes a processor exception).
Many of todays popular architectures fall in the speed penalty category. The CPU designers had to make a trade between flexibility/performance and cost (silicon area/number of control signals required for bus cycles).

2.) What language, which architecture? Consult your compilers manual and/or the CPU architecture documentation.

3.) Again this is totally architecture dependent (some architectures may not permit access on byte-sized items at all, or have bus widths which are not even a multiple of 8 bits). So unless you are asking about a specific architecture you wont get any useful answers.

回答4:

In general, the one answer to all three of those questions is "it depends on your system". Some more details:

Your memory system might not be byte-addressable. Besides that, you might incur a performance penalty to have your processor access unaligned data. Some processors (like older ARM chips, for example) just can't do it at all.
Read the manual for your processor and whatever ABI specification your code is being generated for,
Usually when people refer to data being at a certain alignment, it refers only to the first byte. So if the ABI spec said "data structure X must be 4-byte aligned", it means that X should be placed in memory at an address that's divisible by 4. Nothing is implied by that statment about the size or internal layout of structure X.

As far as your particular example goes, if the data is 4-byte aligned starting at address 1004, the next byte will be at 1005.

回答5:

Its completely depends on the CPU you are using!

Some architectures deal only in 32 (or 36!) bit words and you need special instructions to load singel characters or haalf words.

Some cpus (notably PowerPC and other IBM risc chips) dont care about alignments and will load integers from odd addresses.

For most modern architectures you need to align integers to word boundies and long integers to double word boundries. This simplifies the circutry for loading registers and speeds things up ever so slighly.

回答6:

Data alignment is required by CPU for performance reason. Intel website give out the detail on how to align the data in the memory

Data Alignment when Migrating to 64-Bit Intel® Architecture

One of these is the alignment of data items – their location in memory in relation to addresses that are multiples of four, eight or 16 bytes. Under the 16-bit Intel architecture, data alignment had little effect on performance, and its use was entirely optional. Under IA-32, aligning data correctly can be an important optimization, although its use is still optional with a very few exceptions, where correct alignment is mandatory. The 64-bit environment, however, imposes more-stringent requirements on data items. Misaligned objects cause program exceptions. For an item to be aligned properly, it must fulfill the requirements imposed by 64-bit Intel architecture (discussed shortly), plus those of the linker used to build the application.

The fundamental rule of data alignment is that the safest (and most widely supported) approach relies on what Intel terms "the natural boundaries." Those are the ones that occur when you round up the size of a data item to the next largest size of two, four, eight or 16 bytes. For example, a 10-byte float should be aligned on a 16-byte address, whereas 64-bit integers should be aligned to an eight-byte address. Because this is a 64-bit architecture, pointer sizes are all eight bytes wide, and so they too should align on eight-byte boundaries.

It is recommended that all structures larger than 16 bytes align on 16-byte boundaries. In general, for the best performance, align data as follows:

Align 8-bit data at any address

Align 16-bit data to be contained within an aligned four-byte word

Align 32-bit data so that its base address is a multiple of four

Align 64-bit data so that its base address is a multiple of eight

Align 80-bit data so that its base address is a multiple of sixteen

Align 128-bit data so that its base address is a multiple of sixteen

A 64-byte or greater data structure or array should be aligned so that its base address is a multiple of 64. Sorting data in decreasing size order is one heuristic for assisting with natural alignment. As long as 16-byte boundaries (and cache lines) are never crossed, natural alignment is not strictly necessary, although it is an easy way to enforce adherence to general alignment recommendations.

Aligning data correctly within structures can cause data bloat (due to the padding necessary to place fields correctly), so where necessary and possible, it is useful to reorganize structures so that fields that require the widest alignment are first in the structure. More on solving this problem appears in the article "Preparing Code for the IA-64 Architecture (Code Clean)."