Learning that you can use unions in C for grouping things into namespaces

July 31, 2021

I've done a reasonable amount of programming in C, so I like to think that I know it reasonably well. Still, every so often I learn a new C trick and, in the process, learn that I didn't know C quite as well as I thought. Today's trick is using C unions to group struct fields together, which I learned about through an LWN article.

Suppose that you have a struct with a bunch of fields, and you want to deal with some of them all together at once under a single name; perhaps you want to conveniently copy them as a block through struct assignment. However, you also want to keep simple field access, so that people using your struct don't have to know that these particular fields are special (among other things, this might keep the grouping from leaking into your API). The traditional old school C approach to this is a sub-structure with #defines on top:

struct a {
  int field1;
  struct {
    int field_2;
    int field_3;
  } sub;
};

#define field2 sub.field_2
#define field3 sub.field_3

(Update: corrected the code here to be right. It's been a while since I wrote C.)
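
As a rough sketch of how this pattern gets used (the helper names copy_group and read_field3 are made up for illustration, not taken from any real API), the grouped fields can be copied as a block through sub while ordinary code keeps using the short names:

```c
#include <assert.h>

struct a {
  int field1;
  struct {
    int field_2;
    int field_3;
  } sub;
};

#define field2 sub.field_2
#define field3 sub.field_3

/* Hypothetical helper: copy only the grouped fields, as one block,
   using struct assignment on the sub-structure. field1 is untouched. */
void copy_group(struct a *dst, const struct a *src)
{
  assert(dst && src);
  dst->sub = src->sub;
}

/* Hypothetical helper: the preprocessor rewrites x->field3 into
   x->sub.field_3, so the caller never spells out 'sub'. */
int read_field3(const struct a *x)
{
  return x->field3;
}
```

Code that only cares about individual fields writes x.field2 and x.field3; only code that wants the whole group has to know about sub.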

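One of the problems with this is the #defines, which have very much fallen out of favour as a way of renaming fields. A #define is a global textual substitution, so every later use of the identifier field2, whether as a member of some other struct or as a variable name, gets rewritten to sub.field_2.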

It turns out that modern C lets you do better than this by abusing unions for namespace purposes (anonymous structs and unions were standardized in C11). You embed two identical sub-structs inside an anonymous union, with the same fields in each, giving one sub-struct a name and leaving the other anonymous. The anonymous sub-struct inside the anonymous union lets you access its fields without any additional levels of names. The non-anonymous struct lets you refer to the whole thing by name.

Like so:

struct a {
  int field1;
  union {
    struct {
      int field2;
      int field3;
    };
    struct {
      int field2;
      int field3;
    } sub;
  };
};

Given a variable a of this type, you can access both a.field2 and a.sub, and a.field2 is the same storage as a.sub.field2.
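
Here is a small sketch that exercises this, assuming a C11 compiler. The function names sum_via_group and copy_group are invented for the example; the struct is the one above.

```c
#include <assert.h>

struct a {
  int field1;
  union {
    struct {
      int field2;
      int field3;
    };
    struct {
      int field2;
      int field3;
    } sub;
  };
};

/* Writes through the short names, then reads back through the
   named group. Both views share the same storage in the union. */
int sum_via_group(void)
{
  struct a x = { 0 };
  x.field2 = 20;
  x.field3 = 22;
  assert((void *)&x.field2 == (void *)&x.sub.field2);
  return x.sub.field2 + x.sub.field3;
}

/* Copies just the grouped fields as a block, leaving field1 alone. */
void copy_group(struct a *dst, const struct a *src)
{
  dst->sub = src->sub;
}
```

Unlike the #define version, nothing here touches the preprocessor, so an unrelated variable or struct member called field2 elsewhere in the program is left alone.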

Naturally people create #define macros to automate creating this structure so that all fields stay in sync between the two structs inside the union. Otherwise this "clever" setup is rather fragile.
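
One possible shape for such a macro is sketched below. The name FIELD_GROUP and the function group_is_aliased are hypothetical, chosen for this example, and real projects may structure their versions differently. The idea is simply that the member list is written once and pasted into both structs.

```c
#include <assert.h>

/* Hypothetical helper macro: pastes the same member list into an
   anonymous struct and into a struct named NAME, both inside one
   anonymous union. Requires C11 anonymous structs and unions. */
#define FIELD_GROUP(NAME, ...)   \
  union {                        \
    struct { __VA_ARGS__ };      \
    struct { __VA_ARGS__ } NAME; \
  }

struct a {
  int field1;
  FIELD_GROUP(sub,
    int field2;
    int field3;
  );
};

/* Returns 1 if a write through the named group is visible through
   the short name, which is the whole point of the trick. */
int group_is_aliased(void)
{
  struct a x = { 0 };
  x.sub.field3 = 7;
  assert((void *)&x.field2 == (void *)&x.sub.field2);
  return x.field3 == 7;
}
```

Because the member list is a variadic macro argument, declarations that contain commas (such as int x, y;) still pass through intact.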

(I think this may be a well known thing in the modern C community, but I'm out of touch with modern C for various reasons, especially perverse modern C. This is definitely perverse.)


Comments on this page:

This is interesting, and I'll keep it in mind for the next time I argue with C language programmers who see nothing wrong with using vile hacks to poorly approximate features better languages actually support, and well.

By Alex Shpilkin at 2021-08-02 02:29:26:

While this particular idea is indeed perverse, I’m not sure it’s entirely fair to describe as being exotic due to use of “modern C”: Thompson’s paper “A new C compiler” introducing the Plan 9 toolchain mentions anonymous struct and union members among the deviations from ANSI and specifically calls them out as “the most important and most heavily used of the extensions”. (In fact his version of the feature goes even further, permitting things like struct foo; for pulling in an already-defined structure type, but the syntax around that part is admittedly rather funky.) The description seems to imply that he came up with the idea himself, but whether he was the first to do so is anyone’s guess.

So anonymous struct and union members are about as old as standard C itself, and while the fact that it took twenty years and two revisions of the standard for the community to agree that they are a good idea is rather sad, it doesn’t make the features themselves particularly “modern” in my view.

Written on 31 July 2021.


Last modified: Sat Jul 31 21:09:09 2021