DWARF-2 contains a .debug_frame section that helps debuggers figure out how to unwind frames (i.e. how to restore the registers and stack to the previous frame from any instruction executing in the current frame).
Unfortunately there isn’t a lot of information online that shows, simply enough for someone new like me, how it is decoded.
Below, I will attempt to provide enough information to understand the .debug_frame section of DWARF-2. Fortunately this section is meant to be stand-alone, so not much other background is required.
**This is meant to be a hands-on, step-by-step introduction to .debug_frame, aimed at enabling the reader to start reading more complex ones after the example.**
Things you will need:
- General debugging knowledge, such as ELF structure, the various DWARF debugging sections, using binutils etc.
- The DWARF-2 and DWARF-3 standard documents
- In the DWARF-2 standard, skim through Section 6.4: Call Frame Information (don’t worry if nothing makes sense, just read it so you know what’s included)
- Open Appendix 5 of the DWARF-2 standard and Appendix D.6 of the DWARF-3 standard.
Now we are ready to go. Notice that Appendix 5 and Appendix D.6 of the two standards are really similar; for the most part they are the same.
Basic Idea
The basic idea of the .debug_frame is to allow you to figure out how to restore all the registers to the previous frame on the stack, from any instruction of the current frame. For example, consider the following assembly (with C code inline – objdump -S):
    int testCall() {
      4004cc: 55                      push %rbp
      4004cd: 48 89 e5                mov %rsp,%rbp
    int xyz = 25;
      4004d0: c7 45 fc 19 00 00 00    movl $0x19,-0x4(%rbp)
    xyz++;
      4004d7: 83 45 fc 01             addl $0x1,-0x4(%rbp)
    ........... <omitted>
    void main() {
      4004e0: 55                      push %rbp
      4004e1: 48 89 e5                mov %rsp,%rbp
      4004e4: 48 83 ec 10             sub $0x10,%rsp
    int g = testCall();
      4004e8: b8 00 00 00 00          mov $0x0,%eax
      4004ed: e8 da ff ff ff          callq 4004cc <testCall>
    ........... <omitted>
When we are inside testCall (i.e. from 4004cc onward), the .debug_frame information should allow us, at every instruction, to restore all registers (such as rbp, rsp, eax etc.) to their state in the caller’s frame (here the caller is main). This is done with a table that tracks, for every instruction address, where the previous value of each register can be found. Referring to DWARF-2 Standard 6.4.1, the table structure is as follows:
LOC  CFA  R0 – RN
Where:
- LOC is the assembly address (for example 4004cc), possible values are the address of each instruction
- CFA is the Canonical Frame Address (expressed as a register that serves as the SP/BP plus an offset), more on this in the example below
- R0-RN are the registers that the particular architecture has, possible values are:
  - Undefined, exactly what it says: the previous value of the register cannot be recovered
  - Same, the register has not been modified, so it still holds the value it had in the previous frame
  - Offset (N), the previous value of the register is now stored (as of this row’s LOC) at an offset N from the CFA
  - Register (R), the previous value of the register is now stored in another register R
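To make the four rules concrete, here is a minimal Python sketch of one table row and of how a debugger might apply it. The `Row` class, the `caller_value` helper, and every register and memory value are hypothetical names and numbers chosen for illustration; they do not come from the standard.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    loc: int          # address from which this row applies
    cfa_reg: int      # register the CFA is computed from
    cfa_offset: int   # CFA = value of cfa_reg + cfa_offset
    rules: dict = field(default_factory=dict)  # column -> (kind, operand)


def caller_value(col, row, regs, memory):
    """Return the value column `col` had in the caller, or None if unknown."""
    cfa = regs[row.cfa_reg] + row.cfa_offset
    kind, operand = row.rules.get(col, ("undefined", None))
    if kind == "undefined":
        return None                    # nothing can be recovered
    if kind == "same":
        return regs[col]               # register was never modified
    if kind == "offset":
        return memory[cfa + operand]   # saved on the stack at CFA + N
    if kind == "register":
        return regs[operand]           # saved in another register
    raise ValueError(kind)


# Made-up machine state: CFA = R7 + 12 = 0x800c
row = Row(loc=0x1008, cfa_reg=7, cfa_offset=12,
          rules={8: ("offset", -4), 6: ("same", None), 5: ("register", 1)})
regs = {1: 0x2000, 5: 0, 6: 0x7000, 7: 0x8000, 8: 0x1008}
memory = {0x8008: 0x4242}

print(hex(caller_value(8, row, regs, memory)))  # 0x4242, read from CFA - 4
print(caller_value(4, row, regs, memory))       # None, column 4 has no rule
```

Treating a column with no rule as Undefined is a choice made for this sketch only.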
Example Analysis
From this point onward, we are ready to introduce the example in the Appendix of the standard and analyze it. We start by looking at the first portion of the example to make sure it makes sense.
Worth noting is that in this example given by the standard we can take fsize = 12 (the lowest slot used is at fsize-12, so the frame grows by 12 at the most).
    ;; start prologue
    foo      sub R7, R7, fsize           ; Allocate frame
    foo+4    store R1, R7, (fsize-4)     ; Save the return address
    foo+8    store R6, R7, (fsize-8)     ; Save R6
    foo+12   add R6, R7, 0               ; R6 is now the Frame ptr
    foo+16   store R4, R6, (fsize-12)    ; Save a preserve reg.
    ;; This subroutine does not change R5
    ...
    ;; Start epilogue (R7 has been returned to entry value)
    foo+64   load R4, R6, (fsize-12)     ; Restore R4
    foo+68   load R6, R7, (fsize-8)      ; Restore R6
    foo+72   load R1, R7, (fsize-4)      ; Restore return address
    foo+76   add R7, R7, fsize           ; Deallocate frame
    foo+80   jump R1                     ; Return
    foo+84
With the table (the one given alongside this code in the Appendix of the standard):
We will see that this makes sense:
- At LOC foo, nothing has happened yet (no instruction of foo has run, as the Program Counter is pointing to foo). The only thing that has been done is that the caller has saved R8 (the PC) into R1. The CFA is simply R7+0
- At foo+4, the instruction at location foo has run, so the current SP (R7) is now the entry value minus fsize. The CFA itself does not move, which means the CFA rule is now R7+fsize (see the standard, CFA is the Canonical Frame Address)
- At foo+8, the value of R1 has been stored at R7+(fsize-4).
  - Imagine that with fsize = 12 and each value taking 4 bytes, the current frame essentially has 3 slots. The topmost one is at fsize-4, the middle one at fsize-8, and the bottommost at fsize-12
  - This means that the value of R1 (thus by proxy R8) is now stored on the stack at an offset of -4 from the current CFA (which is R7+fsize)
- At foo+16, the instruction at foo+12 has copied the value of R7 into R6, so from here on the CFA is expressed relative to R6
The rest of the table follows the same pattern, so it is easy to read.
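The arithmetic in the steps above can be checked with a few lines of Python. This is only a sketch: `ENTRY_R7` is a made-up stack pointer value, and fsize = 12 is the assumption used throughout this walk-through.

```python
FSIZE = 12          # frame size assumed in the walk-through
ENTRY_R7 = 0x1000   # hypothetical value of R7 (the SP) on entry to foo

cfa = ENTRY_R7 + 0              # row foo:    CFA = R7 + 0

r7 = ENTRY_R7 - FSIZE           # foo:        sub R7, R7, fsize
cfa_after_sub = r7 + FSIZE      # row foo+4:  CFA = R7 + fsize

r1_slot = r7 + (FSIZE - 4)      # foo+4:      store R1, R7, (fsize-4)
r6_slot = r7 + (FSIZE - 8)      # foo+8:      store R6, R7, (fsize-8)
r6 = r7 + 0                     # foo+12:     add R6, R7, 0
r4_slot = r6 + (FSIZE - 12)     # foo+16:     store R4, R6, (fsize-12)

# The CFA is the same address before and after the frame is allocated.
print(hex(cfa), hex(cfa_after_sub))                   # 0x1000 0x1000
# Each saved register sits at a fixed negative offset from the CFA.
print(r1_slot - cfa, r6_slot - cfa, r4_slot - cfa)    # -4 -8 -12
```

The point is that the CFA is one fixed address for the whole function; only the register and offset used to express it change from row to row.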
Representation of the Table
Now that we know how to read the table, we can see how it is constructed from .debug_frame in the first place. The table is not stored as-is because it would be HUGE.
CIE:
This list shows how to construct the table structure. Read this in order:
- The CIE structure has a length of 32 bytes
- This CIE has no augmentation (cie+9 = 0)
- Code alignment factor is 4, so every location advance is stored divided by 4 (multiply the operand by 4 when decoding)
- Data alignment factor is 4, so every factored offset is stored divided by 4 (multiply the operand by 4 when decoding)
- The next field (cie+12) is the register that holds the return address initially (here 8, i.e. R8)
- The rest of the CIE structure can either be all nops to pad it to the correct length, or instructions that set up the first row of the table. See that the above example sets up the first row correctly
Once the table structure is set up, each FDE describes how the new rows are constructed. There is one FDE per function per compilation unit (and usually only one CIE for each CU).
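As a sketch of how the fields in this list might be read from raw bytes, the Python below parses a hand-built CIE header. It assumes a little-endian, 32-bit DWARF-2 layout (4-byte length, 4-byte CIE id, 1-byte version, NUL-terminated augmentation string, ULEB128 code alignment, SLEB128 data alignment, 1-byte return address register). The byte string and the function names are made up for this example, and the 23 trailing zero bytes are DW_CFA_nop padding standing in for the real initial instructions.

```python
import struct


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 number; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def read_sleb128(data, pos):
    """Decode a signed LEB128 number; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:              # sign bit of the last byte
                result -= 1 << shift
            return result, pos


def parse_cie_header(data):
    """Return (header fields, offset where the initial instructions start)."""
    length, cie_id = struct.unpack_from("<II", data, 0)
    version = data[8]
    end = data.index(0, 9)               # augmentation is NUL-terminated
    augmentation = data[9:end].decode("ascii")
    code_align, pos = read_uleb128(data, end + 1)
    data_align, pos = read_sleb128(data, pos)
    return_reg = data[pos]               # a single byte in DWARF-2
    header = {"length": length, "cie_id": cie_id, "version": version,
              "augmentation": augmentation, "code_align": code_align,
              "data_align": data_align, "return_reg": return_reg}
    return header, pos + 1


# Hand-built CIE: length 32, id 0xffffffff, version 1, "", 4, 4, R8, padding
cie = struct.pack("<II", 32, 0xFFFFFFFF) + bytes([1, 0, 4, 4, 8]) + bytes(23)
header, insn_start = parse_cie_header(cie)
print(header)
print(insn_start)   # 13
```

The length field does not count its own 4 bytes, which is why the 36-byte blob above carries a length of 32.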
- Whenever a new row is added, its columns are first initialized from the previous row
- FDE+16, advance_loc(1) means insert a new row in the table, at the current location plus 1*4 (4 being the code alignment factor)
- FDE+17, def_cfa_offset means that the CFA column now has a new offset; the operand is fsize/4 (4 being the data_alignment)
- FDE+18, add a new row again
- FDE+19, CFA_offset(8, 1) means that register 8 is now stored at an offset of 1*4 from the CFA
- FDE+38, register 8 is restored to its initial rule (the one in the first row of the table)
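The row-building steps above can be sketched as a tiny interpreter. This is a simplified illustration, not a complete decoder: it handles only DW_CFA_advance_loc, DW_CFA_offset, DW_CFA_restore and DW_CFA_nop, it leaves out def_cfa_offset and the CFA column entirely, and the program bytes, the start address 0x1000 and the function names are made up for the example.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 number; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def run_cfa(program, initial_rules, start_loc, code_align, data_align):
    """Execute call frame instructions; return a list of (loc, rules) rows."""
    loc = start_loc
    rules = dict(initial_rules)          # first row comes from the CIE
    rows = [(loc, rules)]
    pos = 0
    while pos < len(program):
        op = program[pos]
        pos += 1
        high, low = op & 0xC0, op & 0x3F
        if high == 0x40:                 # DW_CFA_advance_loc(delta)
            loc += low * code_align
            rules = dict(rules)          # new row starts as a copy
            rows.append((loc, rules))
        elif high == 0x80:               # DW_CFA_offset(reg, factored offset)
            factored, pos = read_uleb128(program, pos)
            rules[low] = ("offset", factored * data_align)
        elif high == 0xC0:               # DW_CFA_restore(reg)
            rules[low] = initial_rules[low]
        elif op == 0x00:                 # DW_CFA_nop
            pass
        else:
            raise NotImplementedError(hex(op))
    return rows


# advance_loc(1); advance_loc(1); offset(8, 1); advance_loc(1); restore(8)
program = bytes([0x41, 0x41, 0x88, 0x01, 0x41, 0xC8])
initial = {8: ("register", 1)}           # return address starts out in R1
rows = run_cfa(program, initial, start_loc=0x1000, code_align=4, data_align=4)
for loc, rules in rows:
    print(hex(loc), rules)
```

Copying the rule dictionary on every advance_loc mirrors the "initialize from the previous row" step, and restore looks the rule up in the initial rules rather than in the previous row.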
That’s it! This basically covers .debug_frame.