
C Union, bitfields (aka. what is that colon in the variable name?)

Neither of these is a commonly used feature (except in embedded systems development or device drivers). I thought I would write about the two together, as they are related.

An example of a union first

union test {
    int i;
    char c[4];
};

A union allocates enough memory to hold the largest type in the union, not all of the types (if this were a struct, memory would be allocated to hold an int AND a char[4]). This means that only one of the members can hold a value at a time. For example, in a union of a float and an int you can use either the int or the float, but not both (since there is only one memory space allocated).

The usefulness of a union comes from the fact that you can access individual bytes in the allocated storage. With the union above, we can set the entire int at once, or we can set each byte of the int individually. For example:

#include <stdio.h>

typedef union test {
    int i;
    char c[4];
} test;

int main(void) {
    test t;
    // set a value
    t.i = 2000;
    // print the hex representation of the int
    printf("%08x\n", t.i);
    // set an individual byte of the int through c[3]
    // (the most significant byte on a little-endian machine)
    t.c[3] = 0x3F;
    // print the hex representation of the int again
    printf("%08x\n", t.i);

    return 0;
}

The above is very useful when trying to write to memory locations. For example, a union around a DWORD could allow you to set the full DWORD at once, or access each individual WORD. A better formatted union could be (as seen on this stackoverflow article):

typedef union {
    struct {
        unsigned char byte1;
        unsigned char byte2;
        unsigned char byte3;
        unsigned char byte4;
    } bytes;
    unsigned int dword;
} HW_Register;
HW_Register reg;

Bit field
Bit fields only exist in structs and unions, and allow you to specify the number of bits used to represent an integer type (signed/unsigned int, char). This can be useful when, for example, you are not storing large numbers in int variables and want to conserve some space. For example:

struct {
    int flag;
} t;

Now this uses 4 bytes of memory, but if you are only going to store 0 and 1 in the flag then it's going to waste a lot of space. Here a bit field can make your program allocate only what it needs:

struct {
    unsigned int flag:1; // means this will use only 1 bit (unsigned, so it holds 0 or 1)
} t;

However, do a sizeof(t), and it's still 4 bytes. This is due to alignment: the bit field still lives inside a whole int-sized storage unit. The real saving comes when you have multiple bit fields:

struct {
    unsigned int flag1:1; // means this will use only 1 bit
    unsigned int flag2:1;
    unsigned int flag3:2;
} t;

The above also uses 4 bytes, but contains 3 flags (whereas 3 separate ints would take 12 bytes with 4-byte ints). Notice that you don't always need to use 1 bit of storage, as can be seen with flag3. Keep in mind that the compiler does not necessarily warn if you assign a value outside the range of the bit field. For example, t.flag3 = 20 will compile and run, but don't expect the correct value to be stored in it.

Concluding thoughts
Combining union and bit fields is a very useful technique for accessing hardware registers. For example (thanks to the same stackoverflow article as above):

typedef union {
    struct {
        unsigned char b1:1;
        unsigned char b2:1;
        unsigned char b3:1;
        unsigned char b4:1;
        unsigned char reserved:4;
    } bits;
    unsigned char byte;
} HW_RegisterB;
HW_RegisterB reg;

Posted in C/C++.

C initialize array inside struct

Another article on initialization is likely to follow, but this is going to serve as a temporary go-to article for initializing structs (and more or less the same for unions).


typedef struct test {
    int i;
    char c[2][5];
} test;

This can be initialized using:

test t = {10, {{'a','b','c','d','\0'}, { 0 }}};
// or
test t = {10, {{'a','b','c','d','\0'}, {'x','y','z','v','\0'}}};

The first one initializes i to 10, the first string to abcd(\0), and the 2nd string to all 0s. The second one does the same, except that it initializes the 2nd string to xyzv(\0).

When initializing an array, as long as at least one element is specified within {}, all the omitted elements are set to 0. Thus {0} sets the first element to 0 explicitly, and the rest to 0 implicitly. This is an important distinction, as you can see by trying the following:

int i[5] = {20};
int j;

for (j=0; j<5; j++)
    printf("%d\n", i[j]);

The result is 20 0 0 0 0

Posted in C/C++.

C variable storage classes: auto, static, register, extern

In C, variables can have a number of different storage classes (not to be confused with qualifiers like const). As the title of the article lists, they are:

  • auto
  • static
  • register
  • extern

These storage classes dictate where variables are located, their lifetime, and how they are accessed. Each variable can have only one storage class, so the code below is invalid:

extern static int i;

We will talk about them in turn below

auto
These are the most common. All variables defined in a code block are auto by default.

Auto variables hold undefined (indeterminate) values until they are assigned a valid value of their type. The keyword indicates that the variable can only be used within the current block, since the variable is automatically created and destroyed as needed. This also means that auto variables cannot be global.

void a(void) {
    int i;
    // equivalent to auto int i;
}

register
The register class asks the compiler to keep the variable in a CPU register instead of in memory. A register variable has no address (you cannot apply the & operator to it), and in practice it should be no bigger than a register. Note that this is only a hint that the variable will be used extensively; modern compilers are free to ignore it, and they often perform this optimization on their own (and better than humans too).

Also note that the register keyword can only be used for local variables and function parameters. It cannot be applied to global variables, because those have static storage by default.

static
Static variables are initialized once, before the program starts running, which means that the initializer must be a constant. They also persist after their scope (the current block of code) is done, and retain their value between calls.

extern
The extern storage class declares a global variable that is defined elsewhere, so that it is visible across object files when linking. An extern declaration is normally written without an initializer, since it only refers to the storage location of the actual variable. The file that contains the actual definition may initialize it.

For example:

// file1
int i = 6;

// file2
extern int i;

Both files must be compiled and linked together (actually, compiling only file1 is OK, but file2 requires linking against file1).

There are a number of nuances. For example, static implies internal linkage while extern implies external linkage, so a global variable defined as

int i;

is not exactly the same as

static int i;

The first one has static storage (duration of the variable) and external linkage (visibility of the variable), whereas the second has static storage and internal linkage (since the static keyword controls both duration=static and visibility=internal).

We will touch on this in another article on linkage

Posted in C/C++.

Arduino Expand tab to 4 spaces instead of 2

Amongst other annoying things, the Arduino IDE defaults to expanding tabs to 2 spaces. To change that, go to (under Win7): User -> AppData -> Roaming -> Arduino

Open up preferences.txt and look for “editor.tabs.size=”, change the value to 4, and restart Arduino.

According to this page, you should only edit this file when Arduino is not running, because the Arduino IDE automatically restores the preferences on exit (um.. why the hell…)

Posted in Arduino.

Python, Unicode and the Console (Windows console anyway) – UnicodeEncodeError

So today I was trying to do searches on a text file encoded in UTF-8. The code worked OK, except that every time a print statement tried to print something that was not ASCII I got an error like the one below:

Traceback (most recent call last):
  File "", line 10, in <module>
    print line
  File "C:\Python27\lib\encodings\", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\ufeff' in position
 0: character maps to <undefined>

After some digging, the reason became obvious. Python's print uses whatever encoding is the default for the current window (in my case a Windows command console), and the default code page there is actually cp437. This is also evident from the traceback, which shows the encoding error coming from inside the encodings package.

What happens is that print uses the current default encoding obtained from sys.stdout.encoding, and since cp437 doesn't know how to encode the Unicode character (here u'\ufeff', the byte order mark at the start of the file), it errors. This basically means you can't print any character outside the console's code page, and will need to find some other way to print those out.

As a side note, to see the current encoding, do the following:

import sys

print sys.stdout.encoding

Posted in Python.

C/C++, Enumerations

As it turns out (per the K&R book), there are a number of things to watch out for with enumerations. Some are well known and trivial while some are not. I’m listing them all here for completeness' sake, with comments on a few of the points:

  1. Names in enums cannot clash (i.e. names must be distinct across all the enums in the same scope)
  2. Values in an enum do not need to be distinct
  3. The first value in an enum defaults to 0 if unspecified
  4. Any name without a value auto-increments from the previously defined value
  5. Compilers will not check whether the value assigned to an enum variable corresponds to a valid name

Point 1
For point 1, notice that there are a number of valid solutions:

Prefixing the names:

enum Color {cRed, cBlue};
enum Mood {mRed, mYellow};


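Wrapping the enum in a struct (C++ only; in C the enumerators are not scoped to the struct, so Red would still clash):

struct ColorStruct { enum Color {Red, Blue} c; };
struct MoodStruct { enum Mood {Red, Yellow} m; };

// In C++
ColorStruct s;
s.c = ColorStruct::Red;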

Using namespace:

namespace Color { enum c {Red, Yellow}; }

// Usable only in C++
Color::c x = Color::Red;
// Or
using namespace Color;
c y = Red;

Notice that this last approach has a problem in large codebases. Since you can use a namespace quite easily anywhere in a program, there is no guarantee that there won’t be a clash from another namespace

Point 2
This is somewhat useful if you want to handle multiple enum names the same way, for example in a switch statement:

#include <stdio.h>

// use typedef enum Color {..} Color; if you don't want to write enum Color varName
// but want to write Color varName
enum Color {cWhite = 0, cBlack = 0, cGray = 1, cRed = 2, cBlue = 3};

int main() {
    enum Color c = 0;
    // can also use the label such as c = cBlack;

    switch(c) {
        case 0:
        case 1:
        case 2:
        case 3:
            // all four values fall through to the same handler
            printf("valid color: %d\n", c);
            break;
    }

    return 0;
}

Point 5
For example if you set c = 5 from the code above, there will be no error. The compiler does not check to see if the value you assigned is actually valid.

In this case you will just get no output, because the switch doesn't hit any case.

Posted in C/C++.

C/C++/Java – Leading 0 (zero) means Octal

I accidentally put a 0 in front of an integer literal, and as it turns out, prefixing an integer value with 0 means it is octal.

Try this:

#include <stdio.h>

int a = 0134;

int main() {
    printf("%d", a);

    return 0;
}

The result is 92

This is the same across C/C++/Java

Posted in C/C++, Java.

C/C++ – Variable and method name length limit

Original C dictates that

  • The first 31 characters of an internal name (i.e. not an external variable) are significant.
  • The first 6 characters of a function name or external variable are significant.

For C99 these limits are 63 and 31 characters respectively.

For C++ the limit depends on the compiler (GCC 1024, MS 2048).

Posted in C/C++.

Android: unable to resolve super class

I have run into this before. This error usually means that you have an external jar file that is not loaded onto the Android device.

Make sure of the following:

  • The external jars are in a directory called libs inside the project
  • In project properties -> Java Build Path -> Order and Export, select the jars for export

Posted in Android.

Nexus 7 not showing up in ADB

Basically it's because the debug driver isn't installed. I swear this has happened with every Android phone I have tried to develop on. I don’t really understand (but assume there is a reason) why Google doesn’t just bundle the drivers or tell you to look for them.

Anyway, assuming you have enabled debug mode on the device, if you look at Device Manager there will be an unrecognized device called Nexus. You need to manually install the driver from here:

Posted in Android.