Docker

One of the difficulties of creating a reverse engineering course like this one, in which attendees write C programs and then disassemble them again, is that everyone needs precisely the same environment (same GLIBC version, same compiler version, same GDB version, same OS, etc.). Otherwise issues and inconsistencies will arise (for example, the assembly on one machine looking vastly different from the assembly on another machine).

To resolve this constraint, I settled on creating and distributing Dockerfiles for each lesson, mostly because they're significantly more lightweight to distribute than a virtual machine. They are also more portable and more likely to run on equipment with lower specifications (consequently making this a more accessible and inclusive course).

Instructions can be found here on how to install Docker, and an excellent tutorial for learning the basics of Docker can be found here.

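Verify that the installation was successful by running the hello-world image -

        docker run hello-world

Then clean up again. Be aware that these commands remove all exited containers and try to remove all images on the machine, not only hello-world, so skip them if you already use Docker for other work -

        docker rm $(docker ps -a -q -f status=exited)
        docker rmi $(docker images -a -q)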

To run a lesson's Docker container, use the following command, replacing lesson1 with the appropriate lesson -

        docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined learnreverseengineering/lesson1 bash

Folks with a background in security might feel a bit uncomfortable running their Docker containers with --security-opt seccomp=unconfined. This instruction came directly from the PwnDBG documentation. I've done some tinkering and didn't find any issues or inconsistencies in the course material when omitting that flag, so feel free to leave it out if that makes you feel more comfortable!

GDB

To do any reverse engineering, we need a way to disassemble the compiled applications that we write as part of this course. We'll use GDB because it's lightweight, runs almost anywhere, and can be turned into a truly beautiful reverse engineering tool with the addition of PwnDBG / GEF / PEDA. No local installation of GDB is required; I've installed and configured it inside each of the Docker containers for this course. Some instructions on how to use GDB with PwnDBG can be found here.

A GDB primer

Not everyone taking this course has encountered GDB (or indeed any debugger or disassembler), so I felt it appropriate to include here some basic commands which will be used throughout the course.

Opening an executable file

        oliver@krankenhaus> gdb nameOfBinaryFile

Listing functions inside of the file

Inside of GDB, run info functions to list all functions in an unstripped binary -

        (gdb) info functions 
        All defined functions:

        Non-debugging symbols:
        0x0000000000001000  _init
        0x0000000000001030  printf@plt
        0x0000000000001040  __cxa_finalize@plt
        0x0000000000001050  _start
        0x0000000000001080  deregister_tm_clones
        0x00000000000010b0  register_tm_clones
        0x00000000000010f0  __do_global_dtors_aux
        0x0000000000001130  frame_dummy
        0x0000000000001139  main
        0x0000000000001170  __libc_csu_init
        0x00000000000011d0  __libc_csu_fini
        0x00000000000011d4  _fini

Running the executable inside of GDB

When GDB starts, the target executable is loaded but not yet running. It's possible to run the executable inside of GDB (useful for analyzing where it crashes, dynamically analyzing it, etc.) with run -

        (gdb) run
        Starting program: /home/oliver/0xff-hello-world/0xff-hello-world 
        [Thread debugging using libthread_db enabled]
        Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
        learnreverseengineering.com is the best![Inferior 1 (process 524301) exited normally]

Setting/clearing breakpoints

Set a breakpoint to pause execution of the application at a certain point with break FUNCTION_NAME or break *ADDRESS -

        (gdb) break main
        Breakpoint 1 at 0x113d
        (gdb) break *0x1139
        Breakpoint 2 at 0x1139
        (gdb) break *main+1
        Breakpoint 3 at 0x113a

Disassembling a function

Inside of GDB, run disassemble FUNCTION_NAME to disassemble a specific function -

        (gdb) disassemble main
        Dump of assembler code for function main:
        0x0000000000001139 <+0>:     push   rbp
        0x000000000000113a <+1>:     mov    rbp,rsp
        0x000000000000113d <+4>:     sub    rsp,0x10
        0x0000000000001141 <+8>:     mov    DWORD PTR [rbp-0x4],edi
        0x0000000000001144 <+11>:    mov    QWORD PTR [rbp-0x10],rsi
        0x0000000000001148 <+15>:    lea    rax,[rip+0xeb9]        # 0x2008
        0x000000000000114f <+22>:    mov    rdi,rax
        0x0000000000001152 <+25>:    mov    eax,0x0
        0x0000000000001157 <+30>:    call   0x1030 <printf@plt>
        0x000000000000115c <+35>:    mov    eax,0x0
        0x0000000000001161 <+40>:    leave  
        0x0000000000001162 <+41>:    ret    
        End of assembler dump.

If execution is paused because of a breakpoint then it's possible to simply run disassemble to disassemble the current function that's executing.


    (gdb) b main
    Breakpoint 8 at 0x55555555513d
    (gdb) run
    Starting program: /home/oliver/0xff-hello-world/0xff-hello-world 
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    
    Breakpoint 8, 0x000055555555513d in main ()
    (gdb) disassemble 
    Dump of assembler code for function main:
       0x0000555555555139 <+0>:     push   rbp
       0x000055555555513a <+1>:     mov    rbp,rsp
    => 0x000055555555513d <+4>:     sub    rsp,0x10
       0x0000555555555141 <+8>:     mov    DWORD PTR [rbp-0x4],edi
       0x0000555555555144 <+11>:    mov    QWORD PTR [rbp-0x10],rsi
       0x0000555555555148 <+15>:    lea    rax,[rip+0xeb9]        # 0x555555556008
       0x000055555555514f <+22>:    mov    rdi,rax
       0x0000555555555152 <+25>:    mov    eax,0x0
       0x0000555555555157 <+30>:    call   0x555555555030 <printf@plt>
       0x000055555555515c <+35>:    mov    eax,0x0
       0x0000555555555161 <+40>:    leave  
       0x0000555555555162 <+41>:    ret    
    End of assembler dump.
    (gdb)

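Observe that execution has paused at *main+4 (marked with =>), where the breakpoint was placed. When given a function name, break places the breakpoint just after the function prologue (the push rbp / mov rbp,rsp pair) rather than at the function's first instruction, which is why break main resolves to main+4 and not main+0.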

Other GDB commands

We're going to make heavy use of GDB during this course, and we'll explore other commands as and when we come to them.

The C Language and GCC compiler

This course uses the C language for reverse engineering. C is a low-level, lightweight language with very few layers of abstraction between it and assembly, so the disassembly of a compiled C program maps back to the original source quite directly.

I've written all of the C examples which will be used in this course, and I analyze/explain them heavily on the assumption that the reader isn't some kind of C language god/goddess. If you're not very well versed in the C language then don't worry, you can still follow along with this course without much trouble.

All of the code in this course is compiled with a very basic GCC configuration: optimization is not enabled and the binaries are not stripped, which means that they still contain symbols (function names and variable names), things which make reverse engineering significantly easier when learning.

Again, there is no requirement to know any special GCC flags / syntax / internals; I'll always provide Bash scripts to do the compilation where necessary.

Prerequisites.