Do I need to keep the padding bits?
The MSP430X architecture is an extension of the 16-bit MSP430 architecture to a 20-bit address space. This is done by widening the processor registers to 20 bits while keeping the smallest addressable unit at one octet (CHAR_BIT equal to 8).
On this architecture, one could imagine a C implementation that provides a 20-bit integer type for int, alongside an 8-bit char, a 16-bit short and an emulated 32-bit long. Since 20 is not a multiple of CHAR_BIT, some padding bits are required when storing a variable of type int. For example, an int could be stored in four bytes, leaving one byte and four bits of another byte as padding.
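As a rough illustration of the arithmetic above, the following sketch counts value bits and storage bits for unsigned int on whatever implementation compiles it. The function names are hypothetical and chosen only for this example. On the imagined 20-bit implementation it would report 20 value bits in 32 bits of storage, that is, 12 padding bits; on most mainstream compilers it reports 0.

```c
#include <limits.h>

/* Number of value bits in unsigned int: count the one-bits of UINT_MAX. */
unsigned value_bits_uint(void)
{
    unsigned bits = 0;
    for (unsigned v = UINT_MAX; v != 0; v >>= 1)
        bits++;
    return bits;
}

/* Number of bits the object occupies in storage. */
unsigned storage_bits_uint(void)
{
    return (unsigned)(sizeof(unsigned) * CHAR_BIT);
}

/* Padding bits are whatever storage is left over after the value bits. */
unsigned padding_bits_uint(void)
{
    return storage_bits_uint() - value_bits_uint();
}
```

Counting the bits of UINT_MAX works because the standard defines the maximum of an unsigned type purely in terms of its value bits, so no knowledge of the storage layout is needed.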
After reading what the standard says about padding bits in integer types, I'm not sure how they should behave. In this case the padding bits exist only in storage, so their value can be neither set nor inspected, except through type punning. And even then, copying an object of that 20-bit type would not copy any padding bits. Is this kind of padding bit allowed by ISO 9899:2011?
The C standard does not require padding bits to be copied by assignment. Assignment is specified in terms of values, not representations.
N1570 6.2.6.2p5 says:
The values of any padding bits are unspecified.
This is an unqualified statement, implying that they are unspecified in all circumstances, even after an assignment from an object that has some padding bits set.
By itself, this statement might be considered too vague to establish firmly that padding bits need not be copied.
Padding bits do not contribute to the value of an integer object. A footnote on the quoted sentence says:
All other combinations of padding bits are alternative object representations of the value specified by the value bits.
("Other" means combinations other than trap representations.)
6.5.16.1p2, describing simple assignment, says:
In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.
The description is in terms of values, not representations; there is no implication that the representation of the RHS must be preserved in the LHS object. And, of course, the RHS of an assignment can be an arbitrary expression, not just a reference to an object. Even if it is just an object name, it undergoes the lvalue conversion described in 6.3.2.1p2; this conversion refers only to the value of the object, not its representation.
(Elsewhere, the standard says that passing arguments to a function and returning a value from a function behave like simple assignment.)
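The distinction between copying a value and copying a representation can be sketched with a simulated type. This is only an illustration under stated assumptions: u20_sim, U20_VALUE_MASK and the helper functions are hypothetical names, the value is treated as unsigned for simplicity, and the low 20 bits of a 4-byte container are assumed to be the value bits. No real MSP430X compiler is claimed to work this way.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: the low 20 bits carry the value, the upper 12 are padding. */
#define U20_VALUE_MASK UINT32_C(0xFFFFF)

/* Hypothetical 4-byte container for a 20-bit integer. */
typedef struct { uint32_t storage; } u20_sim;

/* Models lvalue conversion: only the value bits are read. */
uint32_t u20_value(const u20_sim *x)
{
    return x->storage & U20_VALUE_MASK;
}

/* Models simple assignment: the value is stored, while the padding
   bits of the destination are left as they were. */
void u20_assign(u20_sim *dst, const u20_sim *src)
{
    dst->storage = (dst->storage & (uint32_t)~U20_VALUE_MASK) | u20_value(src);
}

/* Equality as the language sees it: value bits only. */
int u20_same_value(const u20_sim *a, const u20_sim *b)
{
    return u20_value(a) == u20_value(b);
}

/* Equality of object representations, padding included. */
int u20_same_representation(const u20_sim *a, const u20_sim *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

After u20_assign, source and destination compare equal by value even when their padding bits differ, which is the behaviour the quoted wording of 6.5.16.1p2 permits.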
In general, the standard places few restrictions on the size (sizeof) of a type. The main one is that it must be a whole multiple of the size of a char, with sizeof(char) defined as 1.
For padding bits within a type, see 6.2.6.1, which leaves the representation mostly implementation-defined. 6.2.6.2p5 states that the values of the padding bits are unspecified, so there is no requirement to preserve them. There are, however, two important restrictions on padding bits:
- A positive value of a signed integer type must have the same representation as the same value of the corresponding unsigned type. This ensures compatibility between the signed and unsigned variants of the same type for positive values within the range of the signed variant.
- If all bits are zero, the object must represent the value 0. So having all padding bits set to 0 must be valid. However, the reverse is not true: a value of 0 does not imply that all padding bits are zero (thanks to MattMcNabb).
Both restrictions involve the padding bits, since those are part of the internal representation. From a practical point of view, padding bits should simply be set to zero unless they are parity bits or otherwise depend on other bits (and even then the second restriction must be met).
This is a rough interpretation. See the sections cited above for details.
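The two restrictions can be probed on a given implementation with a small sketch like the one below. The function names are hypothetical. The memcmp comparison in the second function additionally assumes that int and unsigned int have no padding bits, or that the padding bits happen to match; on an implementation with real padding bits a bytewise comparison could differ without violating the standard.

```c
#include <string.h>

/* Second restriction: an object whose bits are all zero
   must represent the value 0. */
int all_zero_bits_is_zero(void)
{
    int x;
    memset(&x, 0, sizeof x);
    return x == 0;
}

/* First restriction: a nonnegative value v has the same representation
   in int and in unsigned int.
   Assumption: no padding bits, so a bytewise comparison is meaningful. */
int same_repr_signed_unsigned(int v)
{
    unsigned u = (unsigned)v;
    return memcmp(&v, &u, sizeof v) == 0;
}
```

Calling same_repr_signed_unsigned with a negative argument is outside what the first restriction covers, since the guarantee applies only to values in the range shared by both types.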
On the MSP430X, a 20-bit int is not very practical. The 20-bit registers are mainly intended for extending the addressing range, not for integer arithmetic (although the instruction set does seem to support it; I was wrong about this in a previous edit).
Pointers have a sizeof of 32 bits (four 8-bit bytes) but use only 20 of them. Some embedded compilers may support special qualifiers such as short / near / ..., effectively providing two different pointer sizes. This, however, is actually against the standard. (I'm a bit ambivalent here: optimization versus portability.)
The MSP430X is one of the platforms where using the dedicated types from stdint.h (uintptr_t) and stddef.h (for example size_t) is important, since casting a pointer to or from int will ultimately fail. It also makes clear why the standard only requires (u)intptr_t to serve as temporary storage for a pointer, with no operations on the stored value. So there is no guarantee about the padding bits, even for a null pointer.
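A minimal sketch of the portable pattern: the pointer is parked in a uintptr_t and converted back without any arithmetic on the integer in between. The function name is hypothetical, and uintptr_t is an optional type, so the sketch assumes the implementation provides it.

```c
#include <stdint.h>

/* Round-trips a pointer through uintptr_t and reports whether the
   result compares equal to the original. The integer is used purely
   as temporary storage; nothing is computed on it. */
int roundtrip_preserves_pointer(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    void *q = (void *)bits;
    return q == p;
}
```

Replacing uintptr_t with int in this function would truncate the address wherever int is narrower than a pointer, which is the failure described above.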
The reason for this large overhead (12 of 32 bits, or 37.5%, unused) is that the MSP430X has no way to read or write 20-bit or even 24-bit values to or from memory (and doing so would make array indexing very expensive). Only some constants can be 20 bits wide, because they are encoded in the instruction using an extension word that holds 4 of the bits, while the remaining 16 bits follow the opcode as in other instructions. This is probably one of the more recent (small) architectures to show how much extra effort is needed to expand the address space while maintaining compatibility.
Note that the MSP430X has some additional pitfalls with its 20-bit addressing modes. For example, interrupt handlers must be located in the lower 64 KiB, since the vector table has only 16-bit entries. This effectively rules out defining the vector table in C as an array of function pointers (since the entries cannot be freely converted to any other function pointer and back).
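The constraint on handler placement comes down to whether a 20-bit address survives being stored in a 16-bit entry. The sketch below models addresses as plain integers; the function names are hypothetical and no vendor header or linker setup is implied.

```c
#include <stdint.h>

/* A handler address fits a 16-bit vector entry only if it lies
   in the lower 64 KiB of the 20-bit address space. */
int handler_fits_vector(uint32_t handler_addr)
{
    return handler_addr <= UINT32_C(0xFFFF);
}

/* Storing an address in a 16-bit entry drops the upper 4 bits. */
uint16_t to_vector_entry(uint32_t handler_addr)
{
    return (uint16_t)handler_addr;
}
```

An address such as 0x10000 would be stored as 0x0000, so the entry would no longer identify the handler.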