How to Test a Shellcode

A shellcode is a piece of compiled machine code, typically supplied as input to a program, that launches a shell when it is executed (see Build Your Own Shellcode).

To test a shellcode we are going to use the following code:

    // filename: test_shellcode.c 
    char *code = "<shellcodegoeshere>";
    int main()
    {
        void (*shell)();
        shell=(void (*)())code;
        (*shell)(); 
    } 

In line 2, we paste the shellcode that we want to test.
In line 5, we declare the variable shell as a pointer to a function that returns void and takes no arguments.
In line 6, we cast the string pointer code to the type of the variable shell (i.e., a pointer to a function that returns void and takes no arguments).
In line 7, we call the function pointed to by the shell variable (passing no arguments).

Once we have filled the code variable with the shellcode (in line 2), we can compile and run the program:

    $ gcc -o test_shellcode test_shellcode.c
    $ ./test_shellcode

In this way we can check whether the shellcode works on the current system. Feeding a shellcode to a program as input in order to execute it requires care about a few more things, as explained in Build Your Own Shellcode.

Understanding the test program

The content pointed to by the variable code ends up in an executable portion of the memory layout, as the following steps show.

The command $ readelf -a ./test_shellcode gives a lot of information. Let’s try to break it down into digestible pieces.

If we examine the symbol table with $ readelf -s ./test_shellcode we can see something like this:

    Symbol table '.symtab' contains 62 entries:
    Num:    Value          Size Type    Bind   Vis      Ndx Name
        0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
        1: 0000000000000238     0 SECTION LOCAL  DEFAULT    1 
        2: 0000000000000254     0 SECTION LOCAL  DEFAULT    2 
        3: 0000000000000274     0 SECTION LOCAL  DEFAULT    3
    .........
        59: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __cxa_finalize@@GLIBC_2.2
        60: 00000000000004d0     0 FUNC    GLOBAL DEFAULT   10 _init
        61: 0000000000201010     8 OBJECT  GLOBAL DEFAULT   22 code

In line 10, the symbol code is shown with a Value of 0000000000201010. The Value column represents the address of the symbol.

The command $ readelf -S ./test_shellcode shows the section headers of the ELF file:

    Section Headers:
    [Nr] Name              Type             Address           Offset
        Size              EntSize          Flags  Link  Info  Align
    [ 0]                   NULL             0000000000000000  00000000
        0000000000000000  0000000000000000           0     0     0
    [ 1] .interp           PROGBITS         0000000000000238  00000238
        000000000000001c  0000000000000000   A       0     0     1
    [ 2] .note.ABI-tag     NOTE             0000000000000254  00000254
        0000000000000020  0000000000000000   A       0     0     4
    ......
    [21] .got              PROGBITS         0000000000200fc0  00000fc0
        0000000000000040  0000000000000008  WA       0     0     8 
    [22] .data             PROGBITS         0000000000201000  00001000
        0000000000000018  0000000000000000  WA       0     0     8
    [23] .bss              NOBITS           0000000000201018  00001018
        0000000000000008  0000000000000000  WA       0     0     1

Here, we can see that the .data section has an address of 0000000000201000 and a size of 0000000000000018, so it spans the addresses 0x201000 to 0x201017. The symbol code is located at address 0000000000201010, i.e., inside the .data section.

We could have gathered the same information by running $ objdump -t ./test_shellcode:

    SYMBOL TABLE:
    0000000000000238 l    d  .interp	0000000000000000              .interp
    0000000000000254 l    d  .note.ABI-tag	0000000000000000              .note.ABI-tag
    0000000000000274 l    d  .note.gnu.build-id	0000000000000000              .note.gnu.build-id
    0000000000000298 l    d  .gnu.hash	0000000000000000              .gnu.hash
    00000000000002b8 l    d  .dynsym	0000000000000000              .dynsym
    ......
    0000000000000000  w      *UND*	0000000000000000              _ITM_registerTMCloneTable
    0000000000000000  w    F *UND*	0000000000000000              __cxa_finalize@@GLIBC_2.2.5
    00000000000004d0 g     F .init	0000000000000000              _init
    0000000000201010 g     O .data	0000000000000008              code

In line 11, we can see that the symbol code is declared in the section .data.

To find out which permissions a section has, we have to look at the Program Headers and at the Section to Segment mapping by running $ readelf -l ./test_shellcode (the same output is also part of $ readelf -a ./test_shellcode):

    Program Headers:
    Type           Offset             VirtAddr           PhysAddr
                    FileSiz            MemSiz              Flags  Align
    PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                    0x00000000000001f8 0x00000000000001f8  R      0x8
    INTERP         0x0000000000000238 0x0000000000000238 0x0000000000000238
                    0x000000000000001c 0x000000000000001c  R      0x1
        [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
    LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                    0x0000000000000818 0x0000000000000818  R E    0x200000
    LOAD           0x0000000000000df0 0x0000000000200df0 0x0000000000200df0
                    0x0000000000000228 0x0000000000000230  RW     0x200000
    DYNAMIC        0x0000000000000e00 0x0000000000200e00 0x0000000000200e00
                    0x00000000000001c0 0x00000000000001c0  RW     0x8
    NOTE           0x0000000000000254 0x0000000000000254 0x0000000000000254
                    0x0000000000000044 0x0000000000000044  R      0x4
    GNU_EH_FRAME   0x00000000000006d4 0x00000000000006d4 0x00000000000006d4
                    0x000000000000003c 0x000000000000003c  R      0x4
    GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                    0x0000000000000000 0x0000000000000000  RW     0x10
    GNU_RELRO      0x0000000000000df0 0x0000000000200df0 0x0000000000200df0
                    0x0000000000000210 0x0000000000000210  R      0x1

    Section to Segment mapping:
    Segment Sections...
    00     
    01     .interp 
    02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version
            .gnu.version_r .rela.dyn .init .plt .plt.got .text .fini .rodata .eh_frame_hdr
            .eh_frame 
    03     .init_array .fini_array .dynamic .got .data .bss 
    04     .dynamic 
    05     .note.ABI-tag .note.gnu.build-id 
    06     .eh_frame_hdr 
    07     
    08     .init_array .fini_array .dynamic .got

Here we can see that the .data section is mapped into the segment 03 (i.e., the 4th segment).

Under “Program Headers” we can count the segments from the top, starting at 00: the first is the PHDR segment, the second is INTERP, and the fourth is a LOAD segment with permission to read and to write (but not execute!).

If we don’t have the execute permission, how come we are able to run the code?

Let’s run the code in GDB to clarify this point (I am using GEF):

    Reading symbols from ./test_shellcode...(no debugging symbols found)...done.
    gef➤  b main
    Breakpoint 1 at 0x61e
    gef➤  r
    Starting program: /home/pippo/ctf/lectures/build_shellcode/test
    Breakpoint 1, 0x000055555555461e in main ()
    gef➤  info address code
    Symbol "code" is at 0x555555755010 in a file compiled without debugging.
    gef➤  x/1g 0x555555755010
    0x555555755010 <code>:	0x5555555546c4
    gef➤  x/1g 0x5555555546c4
    0x5555555546c4:	0x5bf0000003cb8

In line 7, we print the address of the global variable code, which is located at 0x555555755010 (as shown in line 8). This variable is a pointer to the memory location 0x5555555546c4 (line 10). Lines 11 and 12 show the first 8 bytes stored at that location.

To see the permissions of these two memory locations we can run gef➤ vmmap:

    gef➤  vmmap 
    Start              End                Offset             Perm Path
    0x0000555555554000 0x0000555555555000 0x0000000000000000 r-x /home/pippo/ctf/lectures/build_shellcode/test_shellcode
    0x0000555555754000 0x0000555555755000 0x0000000000000000 r-- /home/pippo/ctf/lectures/build_shellcode/test_shellcode
    0x0000555555755000 0x0000555555756000 0x0000000000001000 rw- /home/pippo/ctf/lectures/build_shellcode/test_shellcode
    0x00007ffff79e4000 0x00007ffff7bcb000 0x0000000000000000 r-x /lib/x86_64-linux-gnu/libc-2.27.so
    0x00007ffff7bcb000 0x00007ffff7dcb000 0x00000000001e7000 --- /lib/x86_64-linux-gnu/libc-2.27.so
    0x00007ffff7dcb000 0x00007ffff7dcf000 0x00000000001e7000 r-- /lib/x86_64-linux-gnu/libc-2.27.so
    0x00007ffff7dcf000 0x00007ffff7dd1000 0x00000000001eb000 rw- /lib/x86_64-linux-gnu/libc-2.27.so
    0x00007ffff7dd1000 0x00007ffff7dd5000 0x0000000000000000 rw- 
    0x00007ffff7dd5000 0x00007ffff7dfc000 0x0000000000000000 r-x /lib/x86_64-linux-gnu/ld-2.27.so
    0x00007ffff7fcd000 0x00007ffff7fcf000 0x0000000000000000 rw- 
    0x00007ffff7ff7000 0x00007ffff7ffa000 0x0000000000000000 r-- [vvar]
    0x00007ffff7ffa000 0x00007ffff7ffc000 0x0000000000000000 r-x [vdso]
    0x00007ffff7ffc000 0x00007ffff7ffd000 0x0000000000027000 r-- /lib/x86_64-linux-gnu/ld-2.27.so
    0x00007ffff7ffd000 0x00007ffff7ffe000 0x0000000000028000 rw- /lib/x86_64-linux-gnu/ld-2.27.so
    0x00007ffff7ffe000 0x00007ffff7fff000 0x0000000000000000 rw- 
    0x00007ffffffde000 0x00007ffffffff000 0x0000000000000000 rw- [stack]
    0xffffffffff600000 0xffffffffff601000 0x0000000000000000 r-x [vsyscall]

(gef➤ vmmap shows the same information as cat /proc/{process_id}/maps, where process_id is the process ID of the debugged program. To obtain the process ID of the program being debugged, run info inferiors.)

In lines 3, 4 and 5 we can see three memory regions that map the test_shellcode executable with different permissions.

We can see that the variable code (0x555555755010) is located in a memory region that has read and write permissions, in line 5. This is the same information that we obtained previously by reading the ELF file (with the readelf command).

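We can also see that the address pointed to by the variable code (0x5555555546c4) is located in a memory region that has read and execute permissions, in line 3. This answers the question above: the pointer code lives in the non-executable .data section, but the string literal it points to is stored in .rodata, which in this binary belongs to the same R E LOAD segment as .text (segment 02 in the mapping above), so its bytes can be executed.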

Alternative test code

Sometimes you see code like this on the Internet:

    int main(){
        char code[]= "\xb8\x3c\x00\x00\x00\xbf\x05\x00\x00\x00\x0f\x05";
        void (*shell)();
        shell=(void (*)())code;
        (*shell)(); //shell();
    } 

Here, the code variable is declared inside the main function and it is an array, NOT a pointer.

This code WILL NOT work if compiled with:

    $ gcc -o test_shellcode ./test_shellcode.c

The code will compile correctly, but it will produce a Segmentation fault when executed. The reason is simple: the code variable is located on the stack, and since the stack is a non-executable memory region, any attempt to execute instructions from it makes the system raise a segmentation fault.

This code WILL work if compiled by passing the flag to make the stack executable:

    $ gcc -o test_shellcode -z execstack ./test_shellcode.c

(Alternatively we can also use the execstack tool to change the executable stack flag of an ELF file.)