Follow-up to extended discussion in Casting behavior in C
I'm trying to emulate a Z80 in C, where several 8-bit registers can be combined to create 16-bit registers.
This is the logic I'm trying to use:
struct {
    uint8_t b;
    uint8_t c;
    uint16_t *bc;
} regs[1];
...
regs->bc = (uint16_t *)&(regs->b);
Why is this incorrect, and how can I do it correctly (using type-punning if needed)?
I need to do this multiple times, preferably within the same structure.
For those of you that I haven't mentioned this to: I understand that this assumes a little-endian architecture. I have this handled completely.
It's incorrect because b is of type uint8_t, and a pointer to uint16_t cannot be used for accessing such a variable. It might not be correctly aligned, and it is a strict aliasing violation.
You are, however, free to do (uint8_t *)&regs or (struct reg_t *)&regs->b, since (6.7.2.1/15) a pointer to a structure object, suitably converted, points to its initial member, and vice versa.
When doing hardware-related programming, make sure to never use signed types. That means changing intn_t to uintn_t.
As for how to type-pun properly, use a union that overlays a uint16_t member bc on an anonymous struct containing b and c.
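A minimal sketch of such a union, assuming C11 for the anonymous struct and the member names b, c and bc from the question. The type name reg_t, the helper reg_at and the volatile qualifier are illustrative assumptions (volatile is customary for memory-mapped registers), and the integer-to-pointer conversion is implementation-defined.

```c
#include <stdint.h>

/* reg_t is an illustrative name; b, c and bc follow the question. */
typedef union
{
    struct              /* anonymous struct (C11) */
    {
        uint8_t b;
        uint8_t c;
    };
    uint16_t bc;        /* shares storage with b and c */
} reg_t;

/* Hypothetical helper: view the 16-bit register at address addr as a reg_t. */
static inline volatile reg_t *reg_at(uintptr_t addr)
{
    return (volatile reg_t *)addr;
}

/* On a real target this might be used as:
   volatile reg_t *reg = reg_at(0x1234);   (0x1234 is a placeholder address) */
```

Which byte of bc the member b refers to depends on the endianness of the machine running the code.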
You can then make a pointer to this union point at a 16-bit hardware register by casting the register's address, 0x1234 for example, to a pointer to the union type.
NOTE: this union is endianness-dependent. b will access the MS byte of bc on big-endian systems, but the LS byte of bc on little-endian systems.
The correct way is through a union with an anonymous struct in C, as already shown in other answers. But as you want to process bytes, you may use the special handling of character types in the strict aliasing rule: whatever the type of an object, it is always legal to use a char pointer to access the bytes of its representation. So storing the register as a uint16_t and reaching its two bytes through character pointers is conformant C.
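A sketch of that character-pointer approach, assuming the 16-bit value is the real storage and the two 8-bit registers are pointers into its bytes. The struct tag z80_regs and the function z80_regs_init are hypothetical names chosen for illustration.

```c
#include <stdint.h>

struct z80_regs
{
    uint16_t bc;          /* the 16-bit register is the actual storage */
    unsigned char *b;     /* pointers into the bytes of bc */
    unsigned char *c;
};

/* Hypothetical helper: set up the byte pointers.
   Which byte b refers to depends on host endianness. */
static void z80_regs_init(struct z80_regs *r)
{
    r->bc = 0;
    r->b = (unsigned char *)&r->bc;   /* character-type access may alias any object */
    r->c = r->b + 1;
}
```

After initialisation, writes through *r->b and *r->c show up in r->bc, and the other way round.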
Interestingly enough, it is still valid for a C++ compiler...
To emulate a hardware register that can be accessed as two eight-bit registers or one 16-bit register, you can use a union of an anonymous struct holding b and c and a 16-bit member bc.
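A minimal sketch of that declaration, assuming C11 and the unsigned types from the question. The one-element array is kept so that the regs->member syntax from the question still works.

```c
#include <stdint.h>

union
{
    struct { uint8_t b, c; };   /* anonymous struct: b and c are reached directly */
    uint16_t bc;                /* the combined 16-bit register */
} regs[1];
```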
Then regs->bc will be the 16-bit register, and regs->b and regs->c will be 8-bit registers.
Note: This uses an anonymous struct so that b and c appear as if they were members of the union. If the struct had a name, such as s, then you would have to include its name when accessing b or c, as with regs->s.b. However, C has a feature that allows you to use a declaration without a name for this purpose.
Also note this requires a C compiler. C allows using unions to reinterpret data. C++ has different rules.
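For comparison, here is a sketch of the named-struct variant mentioned in the note, assuming the member name s. Every access to the 8-bit registers then has to go through s.

```c
#include <stdint.h>

union
{
    struct { uint8_t b, c; } s;   /* named member: accessed as regs->s.b, regs->s.c */
    uint16_t bc;
} regs[1];
```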