C Bit-Field Pitfalls (os2museum.com)
28 points by fanf2 23 hours ago | 8 comments
- kazinator 17 hours ago
You're supposed to know that bit-fields undergo promotion as if they were small integers, even if they are declared as something like unsigned int.
Therefore, convert the values before operating on them:
    u1 = ((uint64_t) bf.uf1) << 20;
    u2 = ((uint64_t) bf.uf2) << 20;
- tyfighter 16 hours ago
I just can't agree with any interpretation of the article's code that believes programmers should desire silent sign extension when everything about the expression and data types involved is explicitly written to avoid signedness. At the end of the day, programming languages should naturally express intent and not rely on memorization of surprises. Here, I believe that Microsoft correctly employed the principle of least surprise, and that ultimately the spec is broken and, because of the amount of code in existence, just can't be fixed.
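The cast suggested in kazinator's comment above can be sketched as a small function. This is only an illustration: the struct `bits` and the function `shift_field_20` are hypothetical names, and the 12-bit width and shift of 20 are taken from the article excerpt quoted further down the thread.

```c
#include <stdint.h>

/* Hypothetical struct with a 12-bit unsigned field, as in the discussion. */
struct bits {
    unsigned int uf1 : 12;
};

/* Convert to uint64_t before shifting, so the shift is carried out in a
   64-bit unsigned type regardless of how the compiler promotes the
   bit-field on its own. */
uint64_t shift_field_20(struct bits b)
{
    return (uint64_t)b.uf1 << 20;
}
```

With the field set to 0xFFF, the function should return 0xFFF00000 with no sign extension into the upper 32 bits, because the operand is already unsigned and 64 bits wide when the shift happens.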
- pjmlp 22 hours ago
It is much safer to pack/unpack bits manually than to trust that bit-fields will work as expected.
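A minimal sketch of what manual packing might look like, assuming a hypothetical 32-bit word with a 12-bit field in the low bits and a 20-bit field above it. The layout, the macro names, and the function names are all invented for illustration.

```c
#include <stdint.h>

/* Hypothetical layout: bits 0-11 hold `low`, bits 12-31 hold `high`. */
#define LOW_BITS   12u
#define LOW_MASK   ((UINT32_C(1) << LOW_BITS) - 1u)
#define HIGH_MASK  ((UINT32_C(1) << 20) - 1u)

/* Mask each value to its width, then place it at its bit position. */
uint32_t pack_fields(uint32_t low, uint32_t high)
{
    return (low & LOW_MASK) | ((high & HIGH_MASK) << LOW_BITS);
}

/* Extract each field again with a mask or a shift. */
uint32_t unpack_low(uint32_t word)  { return word & LOW_MASK; }
uint32_t unpack_high(uint32_t word) { return word >> LOW_BITS; }
```

Everything here stays in unsigned fixed-width types, so no integer promotion to a signed type is involved and the bit positions do not depend on the compiler's bit-field layout.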
- wahern 18 hours ago
It's even more confusing than described. C23 6.7.3.1p7 says,
> Each of the comma-separated multisets designates the same type, except that for bit-fields, it is implementation-defined whether the specifier `int` designates the same type as `signed int` or the same type as `unsigned int`.
That means a bit-field member using plain `int` as the underlying type might itself be signed or unsigned, similar to whether plain `char` is signed or unsigned. There's a proposal to address this: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3828.pdf Fortunately, it found that GCC, Clang, MSVC, and ICC all treat `int` bit-fields as `signed int`, and the recommendation is to require this behavior.
That said, I don't think I've ever seen a bit-field deliberately used for signed values; there's some logic to the allowance granted by the standard. And I'm sure there's plenty of real-world code erroneously using `int` bit-fields for unsigned values that just happens to work because of two's-complement representation and the semantics of bitwise operations. But better to limit this kind of flexibility, especially when real-world implementations seem not to have taken the alternative route.
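To make the plain `int` ambiguity concrete, here is a small sketch with a 1-bit field. The struct `flag` and the function `read_back` are hypothetical names; the result depends on the implementation, which is the point of the comment above.

```c
/* A 1-bit field declared with plain int. Whether this field is signed or
   unsigned is implementation-defined for bit-fields. */
struct flag {
    int on : 1;
};

/* Stores v into the field and reads it back. If the field is signed, it
   can only hold 0 and -1, so storing 1 is an out-of-range conversion with
   an implementation-defined result (commonly -1). If the field is
   unsigned, 1 reads back as 1. */
int read_back(int v)
{
    struct flag f;
    f.on = v;
    return f.on;
}
```

On compilers that treat `int` bit-fields as `signed int`, `read_back(1)` would be expected to return -1, so a test such as `if (f.on == 1)` never succeeds. Declaring the field as `unsigned int on : 1;` avoids the question.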
- mwkaufma 22 hours ago
tl;dr the standard is unclear on whether bit-fields should respect the signedness of the declaration (MSVC), or always promote to int before converting to a receiving type (GCC, Clang).
I suppose you could say MS's choice reflects a commitment to backwards compatibility, whereas GCC/Clang is always chomping at the bit to introduce more aggressive optimizations that signed-integer-undefined-behavior affords?
- kazinator 17 hours ago
I would say, GCC's behavior shows commitment to the standard. It's exactly the same logic as the promotion of char/signed char/unsigned char, or int and unsigned int.
That is to say, if we work this example with bitfields that are 8 bits wide, like
    uint32_t uf2 : 8;
    bf.uf2 = 255;
and use a shift of 24:
    u2 = bf.uf2 << 24;
the behavior observed on GCC will not change if we edit the declaration of the member to unsigned char:
    unsigned char uf2;
I.e. an unsigned bitfield that is 8 bits wide is basically like unsigned char and promotes that way. From that the reasoning follows for other widths.
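One way to observe the promotion being described, without performing the overflowing shift itself, is to ask the compiler for the type of the promoted expression with C11 `_Generic`. This is a sketch under assumptions: the struct names, the macro `PROMOTES_TO_INT`, and both functions are hypothetical, and `uint32_t` is assumed to be accepted as a bit-field type by the compiler.

```c
#include <stdint.h>

struct with_bitfield { uint32_t uf2 : 8; };   /* 8-bit unsigned bit-field */
struct with_uchar    { unsigned char uf2; };  /* plain unsigned char      */

/* Adding 0 applies the integer promotions to x; _Generic then reports
   whether the resulting expression has type int (1) or not (0). */
#define PROMOTES_TO_INT(x) _Generic((x) + 0, int: 1, default: 0)

int bitfield_promotes_to_int(void)
{
    struct with_bitfield bf = { 255 };
    return PROMOTES_TO_INT(bf.uf2);
}

int uchar_promotes_to_int(void)
{
    struct with_uchar s = { 255 };
    return PROMOTES_TO_INT(s.uf2);
}
```

Where int is wider than unsigned char, `uchar_promotes_to_int()` returns 1. The bit-field result is the disputed part: going by this thread, GCC and Clang would be expected to return 1 as well, while a compiler that keeps the declared unsigned type would return 0.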
- fsckboy 21 hours ago
> The troublesome behavior is demonstrated by the lines performing the left shift. We take a 12-bit wide bit-field, shift it left by 20 bits so ...
this is nonsense. I don't know what they expect would happen, but who cares? I wouldn't shift a 12 bit field by more than ±11 bits.
you can shift the "enclosing" word of memory if you want, just put the original definition in a union.
- kazinator 17 hours ago
For maximal portability, you wouldn't want to shift a 12-bit field by more than 3 bits, because you know that it is narrower than int, but int can be as narrow as 16 bits, which includes a sign bit.
If you assume that int is 32 bits, you can left shift a 12-bit field by 19 bits without hitting the sign bit.
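The arithmetic behind those two limits can be sketched as follows, assuming the field promotes to int. The function names are hypothetical; the rule is simply that the field width plus the shift count must not exceed the number of value bits in int.

```c
#include <limits.h>

/* Counts the value bits of int (everything except the sign bit) by
   shifting INT_MAX down to zero. */
int int_value_bits(void)
{
    int bits = 0;
    for (int m = INT_MAX; m != 0; m >>= 1)
        bits++;
    return bits;
}

/* Largest left shift that keeps a `width`-bit field inside `value_bits`
   value bits, i.e. clear of the sign bit. */
int max_safe_shift(int width, int value_bits)
{
    return value_bits - width;
}
```

A 16-bit int has 15 value bits, so `max_safe_shift(12, 15)` gives 3; a 32-bit int has 31, so `max_safe_shift(12, 31)` gives 19, matching the figures above. `max_safe_shift(12, int_value_bits())` gives the limit for the compiler at hand.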