August 23rd, 2014

Bitfields

Today we’re going to talk a little about the little-used bitfield feature of the C language. Sometimes, a programmer wants to define a structure that represents a bunch of bits; in my case, it’s the IPv4 header. It looks like this:

0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |     Padding   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

As you can see, there’s this funny 13-bit bitfield called fragment offset, which is used when chopping up a packet into fragments for transmission on links with small maximum transmission unit (MTU) settings.  Now, a professional C programmer who hasn’t run into this problem might say this:

typedef struct {
    uint16_t fragment_offset:13;
    uint16_t flags:3;
} IPV4_HDR;

The fields are declared in reverse order because we're assuming a little-endian machine for this example, and the convention on little-endian compilers is that you define the fields backwards within a machine word.  But what you get in this case isn't what you expect:

0                   1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Frag. Off 1  |Flags|Frg Off 2|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

What happened?  Well, we did get 13 bits of the fragment offset, and 3 bits of flags, but the fragment offset field is backwards.  It’s… byte swapped!  On a big-endian machine, you get what you’d expect:

0                   1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Flags|     Fragment Offset     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The various C standards all cop out and say that what really happens with bitfields is implementation defined, and that you should go consult your compiler's documentation to figure out how the bitfields will be packed.  But GCC cops out too, and says that you should go consult your ABI.  Of course, the System V ABI for i386 doesn't even include the word “bitfield”.  The Visual Studio compiler documentation doesn't address this issue either.

So it seems that on the x86, it's not possible to create a bitfield greater than 8 bits and have it pack correctly.  The solution?  Define the fragment offset and flags as a single 16-bit quantity and use macros to do the bit setting yourself:

#define SET_IP_FO(_IPV4_HDR_, _X_) ((_IPV4_HDR_).ip_fo = ((_X_) & 0x1fff))
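
Fleshing that out a little: here's a sketch of set/get macros, assuming the combined 16-bit field is named ip_fo as above and is kept in network byte order, exactly as it sits on the wire.  Note that the set macro below also preserves the three flag bits, which a plain assignment of the masked offset would clobber:

```c
#include <stdint.h>
#include <arpa/inet.h>  /* htons/ntohs; assumes a POSIX environment */

/* Hypothetical header type for illustration: flags (top 3 bits) and
   fragment offset (low 13 bits) share one 16-bit field, stored in
   network byte order. */
typedef struct {
    uint16_t ip_fo;
} IPV4_HDR;

/* Set the 13-bit fragment offset, preserving the 3 flag bits. */
#define SET_IP_FO(hdr, x) \
    ((hdr).ip_fo = htons((uint16_t)((ntohs((hdr).ip_fo) & 0xe000) | \
                                    ((x) & 0x1fff))))

/* Read the fragment offset back out, in host byte order. */
#define GET_IP_FO(hdr) ((uint16_t)(ntohs((hdr).ip_fo) & 0x1fff))

/* Set the 3 flag bits (the top of the 16-bit word on the wire). */
#define SET_IP_FLAGS(hdr, x) \
    ((hdr).ip_fo = htons((uint16_t)((ntohs((hdr).ip_fo) & 0x1fff) | \
                                    (((x) & 0x7) << 13))))
```

Because every access goes through htons()/ntohs(), the same macros behave identically on little- and big-endian machines — which is exactly the portability the bitfield version couldn't give us.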
