
# Extract variable names and offsets from a compound C struct

alindber, published 2018-01-09 23:26:21Z
I am working on a Wireshark plugin that must decode a large packet created by writing a struct directly into the packet. If I knew the variable names and offsets of the original structure, I could decode the data without regard to the original compiler. The struct is large (> 650 bytes) and contains compound elements and typedefs, and its layout changes with the version of the software generating the data.

I have access to the raw header files and the compiler used to build the software, so I can create a framework to extract the details I need for use in Wireshark. I have successfully hand-coded the decode for a few of the variables and offsets, but the size and complexity of the structure require more automation than I can do by hand. Any suggestions on how to do this would be most welcome.
Daniel H, replied 2018-01-10 17:08:27Z
Disclaimer: I haven’t used this myself beyond what you see here, especially not in complex situations. I’m only posting this as an answer because it’s better than the current lack of an answer, so don’t be surprised if there are better options or if critical information is missing.

It looks like all the information you need is available in the debug information when you compile with the `-g` flag. You can see it in close-to-raw format with

    objdump -Wi object_file.o

This isn’t the most useful format, and there are tools to analyze it for you. One of them is `pahole` (sources, LWN article, man page); I haven’t used it enough to recommend it, but it seems at least worth investigating further. It may be listed under `dwarves` or something similar in your package manager, because it’s part of a collection of related tools for handling DWARF debug information.

With a few flags, it seems to output exactly the information you want. The output is easier to parse than the original C, but it still looks more like it’s intended for humans than for computer parsing; I haven’t found a flag that looks purely intended for machine parsing.
Here’s an example `pahole` run.

Input `structs.c`:

    #include <stdint.h>

    struct simple {
        int32_t i32;
        char c;
        uint64_t u64;
    } simple_object;

    struct complicated {
        struct simple nested_simple;
        union {
            float union_float;
            uint32_t union_int;
        } nested_union;
        struct {
            double d;
            char c;
        } nested_struct;
        char some_chars[5];
        enum {COMPLICATED_ENUM1, COMPLICATED_ENUM2, COMPLICATED_ENUM3} enumeration;
    } complicated_object;

After running

    gcc -g structs.c -c -o structs.o
    pahole --expand_types structs.o

you get

    struct simple {
        /* typedef int32_t -> __int32_t */ int i32;                    /*  0  4 */
        char c;                                                        /*  4  1 */

        /* XXX 3 bytes hole, try to pack */

        /* typedef uint64_t -> __uint64_t */ long unsigned int u64;    /*  8  8 */

        /* size: 16, cachelines: 1, members: 3 */
        /* sum members: 13, holes: 1, sum holes: 3 */
        /* last cacheline: 16 bytes */
    };
    struct complicated {
        struct simple {
            /* typedef int32_t -> __int32_t */ int i32;                /*  0  4 */
            char c;                                                    /*  4  1 */

            /* XXX 3 bytes hole, try to pack */

            /* typedef uint64_t -> __uint64_t */ long unsigned int u64; /*  8  8 */
        } nested_simple;                                               /*  0 16 */
        union {
            float union_float;                                         /*     4 */
            /* typedef uint32_t -> __uint32_t */ unsigned int union_int; /*     4 */
        } nested_union;                                                /* 16  4 */

        /* XXX 4 bytes hole, try to pack */

        struct {
            double d;                                                  /* 24  8 */
            char c;                                                    /* 32  1 */
        } nested_struct;                                               /* 24 16 */
        char some_chars[5];                                            /* 40  5 */

        /* XXX 3 bytes hole, try to pack */

        enum {
            COMPLICATED_ENUM1 = 0,
            COMPLICATED_ENUM2 = 1,
            COMPLICATED_ENUM3 = 2,
        } enumeration;                                                 /* 48  4 */

        /* size: 56, cachelines: 1, members: 5 */
        /* sum members: 45, holes: 2, sum holes: 7 */
        /* padding: 4 */
        /* last cacheline: 56 bytes */
    };