C Struct Layout Rules

The layout of fields in a C struct has come up a number of times recently, so it seems appropriate to document the apparent layout rules. This matters to developers who access netcdf-4 datatypes from a language other than C, such as Python or Fortran.

These rules are taken from the HDF5 code. Within netcdf, they are used in ncgen4 and in the soon-to-be-released DAP->netcdf-4 translator.

The key to the layout is the notion of alignment. The alignment of a primitive data type (e.g. char, short, int, etc.) is the memory boundary on which all instances of the type should occur. As a rule, the alignment of a primitive type is equal to its size as reported by sizeof. Thus, the alignment of a char is 1, a short is typically 2, and so on. Note that the alignment of long depends on the machine. It is typically 4 on 32-bit machines and 8 on most 64-bit machines (64-bit Windows, where long is 4 bytes, is an exception).

However, the above rule is not always correct. For some machines, the alignment boundary may be smaller than sizeof indicates. For example, on a SPARC, double values can be aligned on a 4-byte boundary instead of the expected 8-byte boundary. This means the alignment must be computed on a per-machine basis (though hopefully not on a per-compiler basis). To compute these true alignments, one must construct the following set of C structs.

    struct S { char f1; T f2; };

T ranges over all of the possible primitive types: char, short, int, float, double, etc. For each such struct, the value of the offsetof(struct S, f2) macro (from stddef.h) must be calculated and used as the alignment for type T. The offset of a field in a C struct is the relative address of the field from the beginning of the struct, where the initial offset is zero. Thus, on a SPARC, offsetof(struct S, f2) when T = double is 4, whereas on a 64-bit X86 machine, offsetof(struct S, f2) when T = double is 8. This value is the alignment that must be used when computing struct offsets as defined below.
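The probe-struct technique can be sketched in C roughly as follows. This is only an illustration, not the HDF5 or netcdf code: the macro names DECLARE_PROBE and ALIGNMENT_OF, the struct names probe_NAME, and the function print_alignments are all hypothetical names introduced here.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>   /* printf */

/* Declare the probe struct for type T under the (hypothetical) name
 * struct probe_NAME: one char followed by one T. */
#define DECLARE_PROBE(NAME, T) struct probe_##NAME { char f1; T f2; }

DECLARE_PROBE(char, char);
DECLARE_PROBE(short, short);
DECLARE_PROBE(int, int);
DECLARE_PROBE(long, long);
DECLARE_PROBE(float, float);
DECLARE_PROBE(double, double);

/* The alignment of a type is the offset of f2 in its probe struct. */
#define ALIGNMENT_OF(NAME) offsetof(struct probe_##NAME, f2)

/* Print the measured alignment of each probed type on this machine. */
void print_alignments(void)
{
    printf("char   %zu\n", ALIGNMENT_OF(char));
    printf("short  %zu\n", ALIGNMENT_OF(short));
    printf("int    %zu\n", ALIGNMENT_OF(int));
    printf("long   %zu\n", ALIGNMENT_OF(long));
    printf("float  %zu\n", ALIGNMENT_OF(float));
    printf("double %zu\n", ALIGNMENT_OF(double));
}
```

Calling print_alignments() from a main function reports the values for the machine and compiler at hand. The numbers are deliberately not listed here, since the whole point is that they vary.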

To test whether an instance of a primitive type is properly aligned, the following should be true, where A is the address of the instance and alignment is the alignment of its type.

 ((unsigned long)A) % alignment == 0 
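As a small sketch, the same test can be wrapped in a helper function. The name is_aligned is hypothetical, and the cast uses uintptr_t from stdint.h, an assumption made here because it is the integer type intended to hold a pointer value on platforms that provide it.

```c
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uintptr_t */

/* Return 1 if address a is a multiple of alignment, otherwise 0.
 * alignment must be nonzero. */
int is_aligned(const void *a, size_t alignment)
{
    return ((uintptr_t)a) % alignment == 0;
}
```

For example, is_aligned(&x, 4) asks whether the object x happens to start on a 4-byte boundary.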

Given this, the rules for layout of a C struct are as follows.

  1. The initial offset is zero
  2. Given a current offset, O, and a field F whose alignment is A, the offset of F is O + P, where P is the padding needed to be added to make sure that F is aligned to A. P is defined as
    (O % A == 0)?0:(A - (O % A)).      
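  3. After adding field F, the offset is then O = O + P + sizeof(F). Note that it is the size of the field that is added, not its alignment; the two coincide only when the alignment of F equals its size.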
  4. One more rule is needed to complete the description. It appears that the alignment of a nested structure is the alignment of the most stringent field in the nested structure. "Stringent" effectively means the largest alignment.
  5. The size of a struct is the offset after the last field is added rounded up to a multiple of the most stringent field alignment.

More simply put, when adding a field, bump the offset until the offset is at the alignment required by the field.
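The rules can be sketched as a short C routine. This is a minimal illustration under stated assumptions, not the HDF5 or ncgen4 implementation: the type field_info and the functions padding_for and layout_struct are hypothetical names, and each field's size and alignment are assumed to have been measured beforehand. After placing a field, the sketch advances the offset by the field's size, which is the same as its alignment only when the two are equal. A nested struct would be passed in as a single field whose alignment is that of its most stringent member.

```c
#include <stddef.h>  /* size_t */

/* Size and alignment of one field, in bytes (hypothetical type). */
struct field_info {
    size_t size;
    size_t alignment;
};

/* Padding needed to bump offset up to the next multiple of alignment. */
static size_t padding_for(size_t offset, size_t alignment)
{
    size_t rem = offset % alignment;
    return rem == 0 ? 0 : alignment - rem;
}

/* Lay out n fields in order. Stores the offset of field i in offsets[i]
 * and returns the total size of the struct. */
size_t layout_struct(const struct field_info *fields, size_t n,
                     size_t *offsets)
{
    size_t offset = 0;     /* the initial offset is zero */
    size_t max_align = 1;  /* most stringent alignment seen so far */

    for (size_t i = 0; i < n; i++) {
        /* Pad so that the field starts on its alignment boundary. */
        offset += padding_for(offset, fields[i].alignment);
        offsets[i] = offset;

        /* Advance past the field itself. */
        offset += fields[i].size;

        if (fields[i].alignment > max_align)
            max_align = fields[i].alignment;
    }

    /* Round the size up to a multiple of the most stringent alignment. */
    return offset + padding_for(offset, max_align);
}
```

Comparing the computed offsets against offsetof for a real struct on the same machine is a reasonable sanity check for any measured alignment table.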

Comments:

Yup.

A corollary to this is that, when creating a structure from scratch, order the members by decreasing alignment size in order to minimize unused space.

Posted by Steve Emmerson on March 30, 2009 at 03:51 PM MDT #

The alignment and padding vary by compiler and platform.

IMHO, instead of trying to optimize the layout for minimum memory usage with a particular compiler, I would rather make the structure fit the chunk so that its size is constant across platforms, especially if you know it could be referenced from languages other than C.

Posted by guan on May 13, 2009 at 02:01 PM MDT #

Unidata Developer's Blog
A weblog about software development by Unidata developers*